Reading beside the lines: Using indentation to rank revisions by complexity

Authors: Abram Hindle, Michael W. Godfrey, Richard C. Holt

Venue: Science of Computer Programming, Vol. 74, No. 7, pp. 414–429, 2009

Year: 2009

Abstract: Maintainers often face the daunting task of wading through a collection of both new and old revisions, trying to ferret out those that warrant detailed inspection. Perhaps the most obvious way to rank revisions is by lines of code (LOC); this technique has the advantage of being both simple and fast. However, most revisions are quite small, and so we would like a way of distinguishing between simple and complex changes of equal size. Classical complexity metrics, such as Halstead's and McCabe's, could be used but they are hard to apply to code fragments of different programming languages. We propose a language-independent approach to ranking revisions based on the indentation of their code fragments. We use the statistical moments of indentation as a lightweight and revision/diff friendly metric to proxy classical complexity metrics. We found that ranking revisions by the variance and summation of indentation was very similar to ranking revisions by traditional complexity measures since these measures correlate with both Halstead and McCabe complexity; this was evaluated against the CVS histories of 278 active and popular SourceForge projects. Thus, we conclude that measuring indentation alone can serve as a cheap and accurate proxy for computing the code complexity of revisions.
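The abstract's core idea, summarizing a revision by the sum and variance of its indentation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `indentation_stats`, the restriction to added lines of a unified diff, the tab width of 8, and the use of population variance are all assumptions made for illustration.

```python
import statistics


def indentation_stats(diff_text, tab_width=8):
    """Return (sum, variance) of leading-whitespace widths over the added
    lines of a unified diff. Hypothetical helper, for illustration only."""
    widths = []
    for line in diff_text.splitlines():
        # Keep added lines only. Skipping "+++ " drops the file header;
        # this is a heuristic and would also drop an added line whose
        # content happens to start with "++ ".
        if not line.startswith("+") or line.startswith("+++ "):
            continue
        body = line[1:].expandtabs(tab_width)  # assumed tab width
        if not body.strip():
            continue  # ignore blank lines
        widths.append(len(body) - len(body.lstrip(" ")))
    if not widths:
        return 0, 0.0
    # Population variance is an assumption; a sample variance would also work.
    return sum(widths), statistics.pvariance(widths)


if __name__ == "__main__":
    example = (
        "--- a/f.c\n"
        "+++ b/f.c\n"
        "@@ -1,2 +1,4 @@\n"
        "+if (x) {\n"
        "+    y = 1;\n"
        "+        z = 2;\n"
        "-old\n"
        " context\n"
    )
    total, variance = indentation_stats(example)
    print(total, variance)
```

Because the function only inspects leading whitespace, it needs no parser for the language being changed, which is the property the abstract relies on when it calls the metric language-independent.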

BibTeX:

@article{abramhindle2009rbtluitrrbcsocpv7n7,
    author = "Abram Hindle and Michael W. Godfrey and Richard C. Holt",
    title = "Reading beside the lines: Using indentation to rank revisions by complexity",
    year = "2009",
    pages = "414--429",
    journal = "Science of Computer Programming",
    volume = "74",
    number = "7"
}

Plain Text:

Abram Hindle, Michael W. Godfrey, and Richard C. Holt, "Reading beside the lines: Using indentation to rank revisions by complexity," Science of Computer Programming, vol. 74, no. 7, pp. 414–429, 2009.