Learning Off-By-One Mistakes: An Empirical Study

Hendrig Sellik, Onno van Paridon, Georgios Gousios, Maurício Aniche

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Mistakes in binary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work on software developers. While previous research has proposed solutions to identify errors in boundary conditions, the problem remains open. In this paper, we explore the effectiveness of deep learning models in learning and predicting mistakes in boundary conditions. We train different models on approximately 1.6M examples with faults in different boundary conditions. We achieve a precision of 85% and a recall of 84% on a balanced dataset, but lower numbers on an imbalanced dataset. We also perform tests on 41 real-world boundary condition bugs collected from GitHub, where the model shows only modest performance. Finally, we test the model on a large-scale Java code base from Adyen, our industrial partner. The model reported 36 buggy methods, but none of them were confirmed by developers.
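To make the kind of mistake described in the abstract concrete, here is a minimal sketch in Java. It is an illustration only and is not taken from the paper's dataset or from the Adyen code base. The class name, the method names, and the rule that a limit is inclusive are all hypothetical assumptions.

```java
// Illustrative only: the names and the "inclusive limit" rule are
// hypothetical assumptions, not code from the paper.
public class BoundaryExample {

    // Intended rule (assumed): amounts up to AND including the limit are allowed.

    // Off-by-one version: uses < where <= was intended,
    // so an amount exactly equal to the limit is wrongly rejected.
    static boolean isWithinLimitBuggy(int amount, int limit) {
        return amount < limit;
    }

    // Corrected version: <= includes the boundary value itself.
    static boolean isWithinLimitFixed(int amount, int limit) {
        return amount <= limit;
    }

    public static void main(String[] args) {
        int limit = 100;
        // Probe the values just below, on, and just above the boundary.
        for (int amount = limit - 1; amount <= limit + 1; amount++) {
            System.out.println(amount
                    + " buggy=" + isWithinLimitBuggy(amount, limit)
                    + " fixed=" + isWithinLimitFixed(amount, limit));
        }
    }
}
```

The two methods agree on every input except the boundary value itself, which may be one reason such mistakes are easy to miss unless a test probes that exact value.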
Original language: English
Title of host publication: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
Editors: L. O'Conner
Place of Publication: Piscataway
Publisher: IEEE
Pages: 58-67
Number of pages: 10
ISBN (Electronic): 978-1-7281-8710-5
ISBN (Print): 978-1-6654-2985-6
Publication status: Published - 2021
Event: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) - Virtual at Madrid, Spain
Duration: 17 May 2021 to 19 May 2021
Conference number: 18th

Conference

Conference: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
Abbreviated title: MSR21
Country/Territory: Spain
City: Virtual at Madrid
Period: 17/05/21 to 19/05/21

Bibliographical note

Accepted author manuscript

Keywords

  • Boundary testing
  • Deep learning for software engineering
  • Machine learning for software engineering
  • Software testing
