Flattery Can Make AI Chatbots Break the Rules

Sep 8, 2025 8:09 PM

Study Shows Persuasion Tactics Push GPT-4o-Mini Past Guardrails

Want an AI chatbot to call you a jerk or walk you through making a controlled substance? A University of Pennsylvania study shows that plain old-fashioned persuasion tricks, the same ones that sway humans, can push large language models past their guardrails.

First seen on govinfosecurity.com

Jump to article: www.govinfosecurity.com/flattery-make-ai-chatbots-break-rules-a-29386
