Towards Safe, Secure, and Usable LLMs4Code

Ali Al-Kaswan*

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

Large Language Models (LLMs) are gaining popularity in the field of Natural Language Processing (NLP) due to their remarkable accuracy on a wide range of NLP tasks. LLMs designed for coding are trained on massive datasets scraped from the web, which enables them to learn the structure and syntax of programming languages, but also causes them to memorise information contained in those datasets. LLMs for code are also growing in size, making them more challenging to execute and leaving users increasingly reliant on external infrastructure. We aim to explore the challenges faced by LLMs for code and to propose techniques to measure and prevent memorisation. Additionally, we suggest methods to compress models so that they can run locally on consumer hardware.
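As a rough illustration of what measuring memorisation can look like in practice, the sketch below performs a simple prefix-completion probe: the model is given the first part of a snippet suspected to be in its training data, and the greedy continuation is compared against the original suffix. This is not code from the paper; the model name and helper function are illustrative assumptions only.

```python
# Minimal sketch of a prefix-completion memorisation probe (illustrative, not the paper's method).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Salesforce/codegen-350M-mono"  # example code LLM; any causal code model could be substituted
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def is_memorised(prefix: str, expected_suffix: str) -> bool:
    """Return True if greedy decoding reproduces the expected suffix verbatim."""
    inputs = tokenizer(prefix, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=len(tokenizer(expected_suffix)["input_ids"]),
        do_sample=False,  # greedy decoding: memorised text tends to surface here
    )
    # Decode only the newly generated tokens, then compare against the known suffix.
    continuation = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return continuation.startswith(expected_suffix.strip())
```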

Original language: English
Title of host publication: Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering
Subtitle of host publication: Companion, ICSE-Companion 2024
Publisher: IEEE
Pages: 258-260
Number of pages: 3
ISBN (Electronic): 9798400705021
DOIs
Publication status: Published - 2024
Event: ACM/IEEE 46th International Conference on Software Engineering - Lisbon, Portugal
Duration: 14 Apr 2024 - 20 Apr 2024
Conference number: 46
https://conf.researchr.org/home/icse-2024

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: ACM/IEEE 46th International Conference on Software Engineering
Abbreviated title: ICSE '24
Country/Territory: Portugal
City: Lisbon
Period: 14/04/24 - 20/04/24
Internet address: https://conf.researchr.org/home/icse-2024

Keywords

  • compression
  • data leakage
  • large language models
  • memorisation
  • privacy
