TY - GEN
T1 - An Attacker's Dream? Exploring the Capabilities of ChatGPT for Developing Malware
AU - Pa Pa, Yin Minn
AU - Tanizaki, Shunsuke
AU - Kou, Tetsui
AU - Van Eeten, Michel
AU - Yoshioka, Katsunari
AU - Matsumoto, Tsutomu
N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' (Taverne project, https://www.openaccess.nl/en/you-share-we-take-care). Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.
PY - 2023
Y1 - 2023
AB - We investigate the potential for abuse of recent AI advances by developing seven malware programs and two attack tools using ChatGPT, OpenAI Playground's "text-davinci-003" model, and Auto-GPT, an open-source AI agent capable of generating automated prompts to accomplish user-defined goals. We confirm that: 1) Under the safety and moderation controls of recent AI systems, it is possible to generate functional malware and attack tools (up to about 400 lines of code) within 90 minutes, including debugging time. 2) Auto-GPT does not lower the hurdle of crafting the right prompts for malware generation, but it evades OpenAI's safety controls with its automatically generated prompts; when given goals with sufficient detail, it wrote the code for all nine of the malware programs and attack tools we tested. 3) There is still room to improve the moderation and safety controls of ChatGPT and the text-davinci-003 model, especially against the growing body of jailbreak prompts. Overall, we find that recent AI advances, including ChatGPT, Auto-GPT, and text-davinci-003, can generate malware and attack tools despite safety and moderation controls, highlighting the need for stronger safety controls in AI systems.
KW - AI generated malware
KW - Auto-GPT abuses
KW - ChatGPT abuses
UR - http://www.scopus.com/inward/record.url?scp=85171441483&partnerID=8YFLogxK
U2 - 10.1145/3607505.3607513
DO - 10.1145/3607505.3607513
M3 - Conference contribution
AN - SCOPUS:85171441483
T3 - ACM International Conference Proceeding Series
SP - 10
EP - 18
BT - Proceedings of CSET 2023 - 16th Cyber Security Experimentation and Test Workshop
PB - Association for Computing Machinery (ACM)
T2 - 16th Cyber Security Experimentation and Test Workshop, CSET 2023
Y2 - 7 August 2023
ER -