What's hot and what's not: Windowed developer topic analysis

Authors: Abram Hindle Michael W. Godfrey Richard C. Holt

Venue: ICSME 2009 IEEE International Conference on Software Maintenance, pp. 339-348, 2009

Year: 2009

Abstract: As development on a software project progresses, developers shift their focus between different topics and tasks many times. Managers and newcomer developers often seek ways of understanding what tasks have recently been worked on and how much effort has gone into each; for example, a manager might wonder what unexpected tasks occupied their team's attention during a period when they were supposed to have been implementing new features. Tools such as Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI) can be used to extract a set of independent topics from a corpus of commit-log comments. Previous work in the area has created a single set of topics by analyzing comments from the entire lifetime of the project. In this paper, we propose windowing the topic analysis to give a more nuanced view of the system's evolution. By using a defined time-window of, for example, one month, we can track which topics come and go over time, and which ones recur. We propose visualizations of this model that allow us to explore the evolving stream of topics of development occurring over time. We demonstrate that windowed topic analysis offers advantages over topic analysis applied to a project's lifetime because many topics are quite local.
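Illustration (not from the paper): the windowing idea in the abstract can be sketched by grouping commit-log comments into one-month windows and summarizing each window separately. The sketch below is only a rough approximation under stated assumptions: it substitutes simple term counts for LDA or LSI topics, and the function names, stop-word list, and sample commits are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date
import re

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "in", "of", "to", "for", "and", "on"}

def window_key(day: date) -> str:
    """Map a commit date to its one-month window, e.g. '2009-03'."""
    return f"{day.year:04d}-{day.month:02d}"

def top_terms_per_window(commits, k=3):
    """Group (date, message) pairs by month; return the k most frequent terms per window.

    Term frequency is a crude stand-in for topic extraction (assumption, not LDA).
    """
    counts = defaultdict(Counter)
    for day, message in commits:
        words = re.findall(r"[a-z]+", message.lower())
        counts[window_key(day)].update(w for w in words if w not in STOP_WORDS)
    return {
        window: [term for term, _ in counter.most_common(k)]
        for window, counter in sorted(counts.items())
    }

def recurring_terms(top_terms):
    """Return terms that rank among the top terms of more than one window."""
    seen = Counter(term for terms in top_terms.values() for term in set(terms))
    return {term for term, n in seen.items() if n > 1}

# Made-up commit log, for illustration only.
commits = [
    (date(2009, 1, 5), "Fix crash in parser"),
    (date(2009, 1, 20), "Fix parser memory leak"),
    (date(2009, 2, 3), "Add locale support"),
    (date(2009, 2, 14), "Fix locale fallback"),
    (date(2009, 2, 21), "Fix locale date format"),
]
top = top_terms_per_window(commits, k=2)
recurring = recurring_terms(top)
```

With this sample data, "parser" is prominent only in January and "locale" only in February, while "fix" recurs in both windows. This is the kind of local-versus-recurring distinction that a single whole-lifetime analysis would blur.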

BibTeX:

@inproceedings{abramhindle2009whawnwdta,
    author = "Abram Hindle and Michael W. Godfrey and Richard C. Holt",
    title = "What's hot and what's not: Windowed developer topic analysis",
    year = "2009",
    pages = "339--348",
    booktitle = "Proceedings of 2009 IEEE International Conference on Software Maintenance"
}

Plain Text:

Abram Hindle, Michael W. Godfrey, and Richard C. Holt, "What's hot and what's not: Windowed developer topic analysis," 2009 IEEE International Conference on Software Maintenance, pp. 339-348