Breaking the Silence: the Threats of Using LLMs in Software Engineering

J. Sallou; T. Durieux; A. Panichella

Software Engineering

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs.
This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings.
In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns.
The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical example for SE researchers in the context of test case generation.

Original language	English
Title of host publication	ACM/IEEE 46th International Conference on Software Engineering - New Ideas and Emerging Results
Publisher	ACM/IEEE
Number of pages	5
ISBN (Electronic)	979-8-4007-0500-7
Publication status	Accepted/In press - Jan 2024
Event	ACM/IEEE 46th International Conference on Software Engineering - Lisbon, Portugal. Duration: 14 Apr 2024 → 20 Apr 2024. Conference number: 46. https://conf.researchr.org/home/icse-2024

Conference

Conference	ACM/IEEE 46th International Conference on Software Engineering
Abbreviated title	ICSE '24
Country/Territory	Portugal
City	Lisbon
Period	14/04/24 → 20/04/24
Internet address	https://conf.researchr.org/home/icse-2024

Keywords

Large Language Models
Artificial Intelligence
Empirical Software Engineering
Empirical Software Validation

Cite this

@inproceedings{c88e9d83f2564a5188ee72c347cc260f,

title = "Breaking the Silence: the Threats of Using LLMs in Software Engineering",

abstract = "Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs. This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings. In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns. The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical example for SE researchers in the context of test case generation.",

keywords = "Large Language Models, Artificial Intelligence, Empirical Software Engineering, Empirical Software Validation",

author = "J. Sallou and T. Durieux and A. Panichella",

year = "2024",

month = jan,

language = "English",

booktitle = "ACM/IEEE 46th International Conference on Software Engineering - New Ideas and Emerging Results",

publisher = "ACM/IEEE",

note = "ACM/IEEE 46th International Conference on Software Engineering, ICSE '24 ; Conference date: 14-04-2024 Through 20-04-2024",

url = "https://conf.researchr.org/home/icse-2024",

}

TY - GEN

T1 - Breaking the Silence: the Threats of Using LLMs in Software Engineering

AU - Sallou, J.

AU - Durieux, T.

AU - Panichella, A.

N1 - Conference code: 46

PY - 2024/1

Y1 - 2024/1

N2 - Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs. This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings. In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns. The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical example for SE researchers in the context of test case generation.

AB - Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs. This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings. In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns. The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical example for SE researchers in the context of test case generation.

KW - Large Language Models

KW - Artificial Intelligence

KW - Empirical Software Engineering

KW - Empirical Software Validation

M3 - Conference contribution

BT - ACM/IEEE 46th International Conference on Software Engineering - New Ideas and Emerging Results

PB - ACM/IEEE

T2 - ACM/IEEE 46th International Conference on Software Engineering

Y2 - 14 April 2024 through 20 April 2024

ER -
