Augmented Fine-Grained Defect Prediction for Code Review

L. Pascarella

doi:10.4233/uuid:e553e8ae-73be-4718-ab93-81f466db7347

Software Engineering

Research output: Thesis › Dissertation (TU Delft)

Abstract

Code review is a widely used technique to support software quality. It is a manual activity, often subject to repetitive and tedious tasks that increase the mental load of reviewers and compromise their effectiveness. The developer-centered nature of code review can represent a bottleneck that does not scale in large systems, with the consequence of compromising firms’ profits. This challenge has led to an entire line of research on code review improvement.
In this thesis, we present our results and remarks on the effectiveness of using fine-grained defect prediction in code review while investigating the information needs that guide a proper code review. We started by reimplementing the state of the art in defect prediction to understand its replicability; then, we evaluated this model in a more realistic scenario than is typically considered. To improve defect prediction techniques, we devised a fine-grained just-in-time defect prediction model that anticipates the prediction at commit time and reduces the granularity to the file level. After that, we explored how to further improve prediction performance by using alternative sources of information. We conducted a comprehensive investigation of code comments written by both open- and closed-source developers. Finally, to understand how to improve code review further, we explored, from the reviewers’ perspective, what information reviewers need to conduct a proper code review.
Our findings show that the state of the art in defect prediction, when evaluated in a realistic scenario, cannot be directly used to support code review. Furthermore, we found that alternative sets of metrics, anticipated feedback, and fine-grained suggestions represent independent directions to improve prediction performance. Finally, we discovered that research must create intelligent tools that, beyond predicting defects, satisfy actual reviewers’ needs, such as expert selection, splittable changes, real-time communication, and self-summarization of changes.

Original language	English
Qualification	Doctor of Philosophy
Awarding Institution	Delft University of Technology
Supervisors/Advisors	van Deursen, A., Supervisor; Bacchelli, A., Supervisor
Award date	2 Sept 2020
DOIs	https://doi.org/10.4233/uuid:e553e8ae-73be-4718-ab93-81f466db7347
Publication status	Published - 2020

Keywords

Code review
defect prediction
software analytics

Cite this

@phdthesis{e553e8ae73be4718ab9381f466db7347,

title = "Augmented Fine-Grained Defect Prediction for Code Review",

abstract = "Code review is a widely used technique to support software quality. It is a manual activity, often subject to repetitive and tedious tasks that increase the mental load of reviewers and compromise their effectiveness. The developer-centered nature of code review can represent a bottleneck that does not scale in large systems, with the consequence of compromising firms{\textquoteright} profits. This challenge has led to an entire line of research on code review improvement. In this thesis, we present our results and remarks on the effectiveness of using fine-grained defect prediction in code review while investigating the information needs that guide a proper code review. We started by reimplementing the state of the art in defect prediction to understand its replicability; then, we evaluated this model in a more realistic scenario than is typically considered. To improve defect prediction techniques, we devised a fine-grained just-in-time defect prediction model that anticipates the prediction at commit time and reduces the granularity to the file level. After that, we explored how to further improve prediction performance by using alternative sources of information. We conducted a comprehensive investigation of code comments written by both open- and closed-source developers. Finally, to understand how to improve code review further, we explored, from the reviewers{\textquoteright} perspective, what information reviewers need to conduct a proper code review. Our findings show that the state of the art in defect prediction, when evaluated in a realistic scenario, cannot be directly used to support code review. Furthermore, we found that alternative sets of metrics, anticipated feedback, and fine-grained suggestions represent independent directions to improve prediction performance. Finally, we discovered that research must create intelligent tools that, beyond predicting defects, satisfy actual reviewers{\textquoteright} needs, such as expert selection, splittable changes, real-time communication, and self-summarization of changes.",

keywords = "Code review, defect prediction, software analytics",

author = "L. Pascarella",

year = "2020",

doi = "10.4233/uuid:e553e8ae-73be-4718-ab93-81f466db7347",

language = "English",

type = "Dissertation (TU Delft)",

school = "Delft University of Technology",

}

TY - THES

T1 - Augmented Fine-Grained Defect Prediction for Code Review

AU - Pascarella, L.

PY - 2020

Y1 - 2020

N2 - Code review is a widely used technique to support software quality. It is a manual activity, often subject to repetitive and tedious tasks that increase the mental load of reviewers and compromise their effectiveness. The developer-centered nature of code review can represent a bottleneck that does not scale in large systems, with the consequence of compromising firms’ profits. This challenge has led to an entire line of research on code review improvement. In this thesis, we present our results and remarks on the effectiveness of using fine-grained defect prediction in code review while investigating the information needs that guide a proper code review. We started by reimplementing the state of the art in defect prediction to understand its replicability; then, we evaluated this model in a more realistic scenario than is typically considered. To improve defect prediction techniques, we devised a fine-grained just-in-time defect prediction model that anticipates the prediction at commit time and reduces the granularity to the file level. After that, we explored how to further improve prediction performance by using alternative sources of information. We conducted a comprehensive investigation of code comments written by both open- and closed-source developers. Finally, to understand how to improve code review further, we explored, from the reviewers’ perspective, what information reviewers need to conduct a proper code review. Our findings show that the state of the art in defect prediction, when evaluated in a realistic scenario, cannot be directly used to support code review. Furthermore, we found that alternative sets of metrics, anticipated feedback, and fine-grained suggestions represent independent directions to improve prediction performance. Finally, we discovered that research must create intelligent tools that, beyond predicting defects, satisfy actual reviewers’ needs, such as expert selection, splittable changes, real-time communication, and self-summarization of changes.

KW - Code review

KW - defect prediction

KW - software analytics

U2 - 10.4233/uuid:e553e8ae-73be-4718-ab93-81f466db7347

DO - 10.4233/uuid:e553e8ae-73be-4718-ab93-81f466db7347

M3 - Dissertation (TU Delft)

ER -
