Diversity in software engineering research

Authors: Meiyappan Nagappan, Thomas Zimmermann, Christian Bird

Venue: FSE, 9th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 466--476, 2013

Year: 2013

Abstract: One of the goals of software engineering research is to achieve generality: Are the phenomena found in a few projects reflective of others? Will a technique perform as well on projects other than the projects it is evaluated on? While it is common sense to select a sample that is representative of a population, the importance of diversity is often overlooked, yet it is just as important. In this paper, we combine ideas from representativeness and diversity and introduce a measure called sample coverage, defined as the percentage of projects in a population that are similar to the given sample. We introduce algorithms to compute the sample coverage for a given set of projects and to select the projects that increase the coverage the most. We demonstrate our technique on research presented over the span of two years at ICSE and FSE with respect to a population of 20,000 active open source projects monitored by Ohloh.net. Knowing the coverage of a sample enhances our ability to reason about the findings of a study. Furthermore, we propose reporting guidelines for research: in addition to coverage scores, papers should discuss the target population of the research (universe) and the dimensions that can potentially influence the outcomes of the research (space).
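The two algorithms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy universe, and in particular the `similar` predicate (same language, size within a factor of two) are assumptions made only for this example. The abstract fixes only the definition of sample coverage as the percentage of projects in the population that are similar to the sample, and the idea of selecting the projects that increase coverage the most, which is read here as a greedy loop.

```python
from typing import Callable, List, Sequence, Set, Tuple

Project = Tuple[str, int]  # hypothetical dimensions: (language, lines of code)
Similar = Callable[[Project, Project], bool]


def covered(universe: Sequence[Project], sample: Sequence[Project],
            similar: Similar) -> Set[int]:
    """Indices of universe projects similar to at least one sample project."""
    return {i for i, p in enumerate(universe)
            if any(similar(p, s) for s in sample)}


def sample_coverage(universe: Sequence[Project], sample: Sequence[Project],
                    similar: Similar) -> float:
    """Percentage of the universe that is similar to the sample."""
    if not universe:
        return 0.0
    return 100.0 * len(covered(universe, sample, similar)) / len(universe)


def greedy_select(universe: Sequence[Project], k: int,
                  similar: Similar) -> List[Project]:
    """Pick up to k projects, each time adding the one that covers the most
    not-yet-covered projects (a greedy reading of 'increase the coverage
    the most')."""
    if not universe:
        return []
    # Which universe projects would each candidate cover?
    neighbours = [{j for j, q in enumerate(universe) if similar(q, p)}
                  for p in universe]
    done: Set[int] = set()
    sample: List[Project] = []
    for _ in range(k):
        best = max(range(len(universe)),
                   key=lambda i: len(neighbours[i] - done))
        if not neighbours[best] - done:
            break  # no candidate adds coverage
        sample.append(universe[best])
        done |= neighbours[best]
    return sample


# Assumed similarity rule, for illustration only.
def similar(a: Project, b: Project) -> bool:
    return a[0] == b[0] and max(a[1], b[1]) <= 2 * min(a[1], b[1])


universe = [("Java", 1000), ("Java", 1500), ("Java", 100000),
            ("C", 1200), ("C", 2000), ("Python", 500)]

single = sample_coverage(universe, [("Java", 1000)], similar)
picked = greedy_select(universe, 2, similar)
picked_coverage = sample_coverage(universe, picked, similar)
```

In this toy universe a sample of one small Java project covers two of six projects, and the greedy selection adds a small C project next, because it covers two further projects while any other candidate would add at most one. With a real population, the choice of dimensions and of the similarity rule determines the score, which is why the abstract asks papers to report the universe and the space alongside the coverage.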

BibTeX:

@inproceedings{meiyappannagappan2013diser,
    author = "Meiyappan Nagappan and Thomas Zimmermann and Christian Bird",
    title = "Diversity in software engineering research",
    year = "2013",
    pages = "466--476",
    booktitle = "Proceedings of the 2013 9th joint meeting on foundations of software engineering"
}

Plain Text:

Meiyappan Nagappan, Thomas Zimmermann, and Christian Bird, "Diversity in software engineering research," 9th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 466--476, 2013.