JailbreakTracer: Explainable Detection of Jailbreaking Prompts in LLMs Using Synthetic Data Generation

The emergence of Large Language Models (LLMs) has revolutionized natural language processing (NLP), enabling remarkable advances across a wide range of applications. However, these models remain susceptible to adversarial prompts, commonly referred to as jailbreaks, which exploit their vulnerabilities t...

Bibliographic Details
Main Authors: Md. Faiyaz Abdullah Sayeedi, Maaz Bin Hossain, Md. Kamrul Hassan, Sabrina Afrin, Molla Md. Sabit Hossain, Md. Shohrab Hossain
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11036671/