Data mining in software engineering

M. Halkidi, D. Spinellis, G. Tsatsaronis, M. Vazirgiannis

Research output: Contribution to journal, Review article, peer-review

22 Citations (Scopus)


The increased availability of data created as part of the software development process allows us to apply novel analysis techniques to that data and use the results to guide the optimization of the process. In this paper we describe various data sources and discuss the principles and techniques of data mining as applied to software engineering data. Data that can be mined is generated by most parts of the development process: requirements elicitation, development analysis, testing, debugging, and maintenance. Based on this classification, we survey the mining approaches that have been used and categorize them according to the corresponding parts of the development process and the task they assist. The survey thus provides researchers with a concise overview of data mining techniques applied to software engineering data, and aids practitioners in selecting appropriate data mining techniques for their work.

Original language: English
Pages (from-to): 413-441
Number of pages: 29
Journal: Intelligent Data Analysis
Issue number: 3
Publication status: Published - 2011
Externally published: Yes


  • Data mining techniques
  • KDD methods
  • mining software engineering data
