Abstract
Mistakes in boundary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work on software developers. While previous research has proposed solutions to identify errors in boundary conditions, the problem remains open. In this paper, we explore the effectiveness of deep learning models in learning and predicting mistakes in boundary conditions. We train different models on approximately 1.6M examples with faults in different boundary conditions. We achieve a precision of 85% and a recall of 84% on a balanced dataset, but lower numbers on an imbalanced dataset. We also test the models on 41 real-world boundary condition bugs mined from GitHub, where they show only a modest performance. Finally, we evaluate the model on a large-scale Java code base from Adyen, our industrial partner. The model reported 36 methods as buggy, but none of them were confirmed by developers.
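To illustrate the class of bugs the paper targets, the following minimal Java sketch shows an off-by-one boundary mistake of the kind described in the abstract (using < where <= is needed, or vice versa). The class and method names are illustrative and not taken from the paper or its dataset.

```java
public class BoundaryExample {
    // Buggy variant: the condition excludes the last valid index,
    // because "< length - 1" was written where "< length" was intended.
    static boolean isValidIndexBuggy(int index, int length) {
        return index >= 0 && index < length - 1;
    }

    // Fixed variant: the boundary condition covers the full valid
    // index range [0, length - 1].
    static boolean isValidIndex(int index, int length) {
        return index >= 0 && index < length;
    }
}
```

For an array of length 5, the buggy variant rejects the valid index 4 while the fixed variant accepts it; such single-token differences are exactly what makes these faults hard to spot in code review.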
Original language | English |
---|---|
Title of host publication | 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) |
Editors | L. O'Conner |
Place of Publication | Piscataway |
Publisher | IEEE |
Pages | 58-67 |
Number of pages | 10 |
ISBN (Electronic) | 978-1-7281-8710-5 |
ISBN (Print) | 978-1-6654-2985-6 |
DOIs | |
Publication status | Published - 2021 |
Event | 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) - Virtual at Madrid, Spain |
Duration | 17 May 2021 → 19 May 2021 |
Conference number | 18th |
Conference
Conference | 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) |
---|---|
Abbreviated title | MSR21 |
Country/Territory | Spain |
City | Virtual at Madrid |
Period | 17/05/21 → 19/05/21 |
Bibliographical note
Accepted author manuscript

Keywords
- Boundary testing
- Deep learning for software engineering
- Machine learning for software engineering
- Software testing