Abstract
Code comments are a key resource for information about software artefacts. Depending on the use case, only some types of comments are useful. Thus, automatic approaches to classify these comments have been proposed. In this work, we address this need by proposing STACC, a set of SentenceTransformers-based binary classifiers. These lightweight classifiers are trained and tested on the NLBSE Code Comment Classification tool competition dataset, and surpass the baseline by a significant margin, achieving an average F1 score of 0.74 against the baseline of 0.31, an improvement of 139%. A replication package and the models themselves are publicly available.
Original language | English |
---|---|
Title of host publication | The 2nd Intl. Workshop on NL-based Software Engineering |
Number of pages | 4 |
Publication status | Accepted/In press - 2023 |
Event | 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE), Melbourne, Australia. Duration: 14 May 2023 → 20 May 2023. Conference number: 2. https://nlbse2023.github.io/ |
Workshop
Workshop | 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE) |
---|---|
Abbreviated title | NLBSE 2023 |
Country/Territory | Australia |
City | Melbourne |
Period | 14/05/23 → 20/05/23 |
Internet address | https://nlbse2023.github.io/ |