Enhancing Phishing Email Detection with Context-Augmented Open Large Language Models

This repository contains supplementary materials for our URAI 2024 conference paper.

Phishing is a significant and growing threat to cybersecurity. Attacks use constantly evolving techniques to tempt people into revealing sensitive personal information. It is estimated that 90 percent of all successful cyberattacks have phishing as their initial attack vector. The rise of Large Language Models (LLMs) has revolutionized the field of Natural Language Processing (NLP). Early popular representatives such as OpenAI's GPT (Generative Pretrained Transformer) models have showcased the power of LLMs for language generation and understanding. They are trained on large and diverse text corpora, and their application to machine learning problems beyond the original task of text generation is an increasingly active research question. With their deep understanding of natural language, LLMs are a promising starting point for the detection of phishing emails. This paper presents an approach that combines the in-context learning and augmentation methods Few-Shot Learning (FSL) and Retrieval Augmented Generation (RAG) for phishing email classification. It dynamically augments the context of an LLM in a problem-specific way at inference time, without the need for intensive, task-specific training or a dedicated model. The approach is evaluated in experiments across different open models and compared to more common state-of-the-art prompting techniques.

Keywords: Artificial Intelligence, AI, Cybersecurity, Large Language Models
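
The combination of FSL and RAG described above can be illustrated with a minimal sketch. It is not the implementation from the paper: the example emails, the function names, the prompt wording, and the word-overlap cosine similarity (a stand-in for whatever retriever is actually used) are all assumptions made for illustration. The idea shown is only that the labeled examples most similar to the incoming email are retrieved at inference time and placed in the prompt as few-shot demonstrations.

```python
import math
import re
from collections import Counter

# Hypothetical labeled examples; a real setup would draw on a much larger corpus.
KNOWLEDGE_BASE = [
    ("Your account is suspended. Verify your password at this link now.", "phishing"),
    ("Urgent: confirm your bank details to avoid account closure.", "phishing"),
    ("Attached are the meeting notes from Tuesday's project sync.", "legitimate"),
    ("Lunch on Friday? The team is meeting at noon.", "legitimate"),
]


def tokenize(text):
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(email, k=2):
    """Return the k labeled examples most similar to the email (assumed retriever)."""
    query = tokenize(email)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda example: cosine(query, tokenize(example[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(email, k=2):
    """Assemble a few-shot prompt from retrieved examples (hypothetical wording)."""
    lines = ["Classify the email as 'phishing' or 'legitimate'.", ""]
    for text, label in retrieve(email, k):
        lines += [f"Email: {text}", f"Label: {label}", ""]
    lines += [f"Email: {email}", "Label:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Please verify your password to keep your account active."))
```

The resulting prompt string would then be sent to an open LLM of choice; because the demonstrations are selected per email, no task-specific fine-tuning is involved in this sketch.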

Authors

Fabian Nicklas
Nicolas Ventulett
Prof. Dr.-Ing. Jan Conrad

Cite

Nicklas, Fabian, Ventulett, Nicolas, & Conrad, Jan (2024). Enhancing Phishing Email Detection with Context-Augmented Open Large Language Models. In Proceedings of the Upper-Rhine Artificial Intelligence Symposium 2024.

@InProceedings{EnhancingPhishingEmailDetection2024,
  author    = {Nicklas, Fabian and Ventulett, Nicolas and Conrad, Jan},
  booktitle = {Proceedings of the Upper-Rhine Artificial Intelligence Symposium 2024},
  title     = {Enhancing Phishing Email Detection with Context-Augmented Open Large Language Models},
  year      = {2024},
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
See https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en
