This repository provides supplementary materials for our URAI 2024 conference paper.
Phishing is a significant and growing threat to cybersecurity. Attacks use constantly evolving techniques to tempt people into revealing sensitive personal information. An estimated 90 percent of all successful cyberattacks use phishing as the initial attack vector. The rise of Large Language Models (LLMs) has revolutionized the field of Natural Language Processing (NLP). Early popular representatives such as OpenAI's GPT (Generative Pretrained Transformer) models have showcased the power of LLMs for language generation and understanding. They are trained on large, diverse text corpora, and their application to machine learning problems beyond the original task of text generation is an increasingly active research question. With their deep understanding of natural language, LLMs are a promising starting point for the detection of phishing emails. This paper presents an approach that combines the in-context learning and augmentation methods Few-Shot Learning (FSL) and Retrieval Augmented Generation (RAG) for phishing email classification. It dynamically augments the context of LLMs in a problem-specific way at inference time, without the need for intensive, task-specific training or a dedicated model. The approach is evaluated in experiments across different open models and compared to more common state-of-the-art prompting techniques.
Keywords: Artificial Intelligence, AI, Cybersecurity, Large Language Models
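The idea described in the abstract, retrieving labeled examples similar to the incoming email and placing them in the prompt as few-shot demonstrations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the example emails, the prompt wording, and the function names are hypothetical, and a simple token-overlap cosine similarity stands in for whatever retriever the authors actually used.

```python
import math
import re
from collections import Counter

# Hypothetical labeled examples (illustrative only, not from the paper's data).
KNOWLEDGE_BASE = [
    ("Your account is suspended. Verify your password at the link now.", "phishing"),
    ("You won a prize! Send your bank details to claim it today.", "phishing"),
    ("Attached are the meeting notes from Tuesday's project sync.", "legitimate"),
    ("Reminder: the team lunch is moved to Friday at noon.", "legitimate"),
]


def tokenize(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(email, k=2):
    """Return the k labeled examples most similar to the email (stand-in retriever)."""
    query = tokenize(email)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda example: cosine(query, tokenize(example[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(email, k=2):
    """Build a few-shot prompt whose demonstrations are chosen at inference time."""
    lines = ["Classify the email as 'phishing' or 'legitimate'.", ""]
    for text, label in retrieve(email, k):
        lines += [f"Email: {text}", f"Label: {label}", ""]
    lines += [f"Email: {email}", "Label:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Please verify your password now or your account is suspended."))
```

The resulting prompt string would then be sent to an open LLM of choice. Because the demonstrations are selected per email at inference time, no task-specific fine-tuning is involved in this sketch.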
- Fabian Nicklas
- Nicolas Ventulett
- Prof. Dr.-Ing. Jan Conrad
Nicklas, Fabian, Ventulett, Nicolas, & Conrad, Jan (2024). Enhancing Phishing Email Detection with Context-Augmented Open Large Language Models. In Proceedings of the Upper-Rhine Artificial Intelligence Symposium 2024.
@InProceedings{EnhancingPhishingEmailDetection2024,
author = {Nicklas, Fabian and Ventulett, Nicolas and Conrad, Jan},
booktitle = {Proceedings of the Upper-Rhine Artificial Intelligence Symposium 2024},
title = {Enhancing Phishing Email Detection with Context-Augmented Open Large Language Models},
year = {2024},
}
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
See https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en
Copyright (c) 2024 Fabian Nicklas, Nicolas Ventulett, Prof. Dr.-Ing. Jan Conrad
