STACC: Code Comment Classification using SentenceTransformers

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

Abstract

Code comments are a key resource for information about software artefacts. Depending on the use case, only some types of comments are useful. Thus, automatic approaches to classify these comments have been proposed. In this work, we address this need by proposing STACC, a set of SentenceTransformers-based binary classifiers. These lightweight classifiers are trained and tested on the NLBSE Code Comment Classification tool competition dataset and surpass the baseline by a significant margin, achieving an average F1 score of 0.74 against the baseline of 0.31, an improvement of 139%. A replication package and the models themselves are publicly available.
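
The reported improvement can be checked with a line of arithmetic. The sketch below is only an illustration: the function name `relative_improvement` is hypothetical, and the inputs are the rounded average F1 scores quoted in the abstract, so the authors' exact figure may have been computed from unrounded values.

```python
def relative_improvement(baseline: float, score: float) -> float:
    """Return the relative gain of `score` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score - baseline) / baseline * 100.0


# Rounded average F1 scores as quoted in the abstract (assumed inputs).
BASELINE_F1 = 0.31
STACC_F1 = 0.74

improvement = relative_improvement(BASELINE_F1, STACC_F1)
print(f"Relative improvement: {improvement:.0f}%")
```

With these rounded inputs the gain comes to roughly 138.7%, which is consistent with the 139% stated in the abstract.
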
Original language: English
Title of host publication: The 2nd Intl. Workshop on NL-based Software Engineering
Number of pages: 4
Publication status: Accepted/In press - 2023
Event: 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE) - Melbourne, Australia
Duration: 14 May 2023 to 20 May 2023
Conference number: 2
https://nlbse2023.github.io/

Workshop

Workshop: 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE)
Abbreviated title: NLBSE 2023
Country/Territory: Australia
City: Melbourne
Period: 14/05/23 to 20/05/23
