5 steps for deploying agentic AI red teaming

Five steps to take towards implementing agentic red teaming:

1. Change your attitude

Perhaps the biggest challenge for agentic red teaming is adjusting your perspective in how to defend your enterprise. “The days where database admins had full access to all data are over,” says Suer. “We need to have a fresh attitude towards data and fully understand its business relevance.” As an example, a common pen testing tool such as Burp Suite can be used to detect model inputs and outputs that are misused by an AI model, suggested Brauchler. “The context is key, and Burp can still be used to automate testing for jailbroken agent behaviors, such as what happened with the Crescendo attack.”Kurt Hoffman, head of the application security department of Blizzard Entertainment, tells CSO that AI agents are “really just a force multiplier and are a skilled addition to existing pen testing, but not a replacement. You should use AI agents to do the tedious and boring parts of red teaming and use humans to find creative and novel attack approaches. This is because agents always work best in tandem with humans. AI agents have the ability to scale up attacks to levels we have never seen before.”Part of that attitude is to look at agentic defense differently. “We need to test how humans actually use gen AI systems,” AI strategist Kate O’Neill tells CSO. “Most real-world AI security failures happen not because someone hacked the agent, but because users developed blind spots either over-trusting capabilities that aren’t there or finding workarounds that bypass safety measures entirely. Red teaming is necessary but not sufficient. The most effective programs I’ve seen combine traditional security testing with participatory design sessions and stakeholder impact mapping. You want to understand not just ‘can we break this?’ but ‘who gets hurt when this works exactly as designed?’”Another depressing thought: “It is like fighting a tidal wave with a squire gun, because you are looking at the symptoms and not treating the disease,” said Brauchler.

2. Know and continually test your guardrails and governance

Many of the agentic-based exploits find clever ways to maneuver around various security guardrails to encourage malicious behavior. The CSA report goes into almost excruciating details about how these exploits work, what prompts can be used to circumvent things, and how you can try to avoid them.”Understanding where you need to place these guardrails, either in the cloud or in your workflows or both, is critical. You need to do the appropriate testing before you release any AI agents into production, and have the necessary governance and controls and observability, especially as your environment can change dynamically,” Gartner analyst Tom Coshow tells CSO.One effort worth considering is Forrester’s Agentic AI Guardrails for Information Security (AEGIS). It covers governance, data and app security and layers in a zero-trust architecture in other words, quite a lot to take into account.

3. Widen your base for team members

One small glimmer of hope is that organizations can use a wider skill base for their red teams. “An AI red teamer just needs to know English, or whatever language is being tested. Even a college history major can use language to manipulate a model’s behavior,” said Pangea’s Melo.

4. Widen the solution space

“Just remember,” CalypsoAI president James White tells CSO. “There is no threat to a running gen AI model until you ask it a question. But agents can get around this, because agents can find almost limitless ways to break the typical chronological causation chain.” This means casting a wider net to understand what is happening across your organization. Break the historical habits of this causation chain and see the potential threats as parts of a whole.”AI is no longer just a tool; it is a participant in systems, a co-author of code, a decision-maker, and increasingly, an adversary,” wrote RADware’s director of threat intel Pascal Geenens in a report. “From the adversary’s point of view, however, the game has changed”, and the odds are in their favor. They’re no longer limited by time, talent, or budget.”As O’Neill says: “The CSA report gives you the technical foundation; the human-centric piece is what turns that into a program that prevents harm in the real world.”

5. Consider the latest tools and techniques

Building secure agentic systems requires more than just securing individual components; it demands a holistic approach where security is embedded within the architecture itself, according to OWASP. To that end, it lists several development tools (some of which are open-source projects) that can be used to craft and launch red teaming workflows, such as AgentDojo, SPLX’s Agentic Radar, Agent SafetyBench and HuggingFace’s Fujitsu benchmarking data set. And more recently, Solo.io released its Agentgateway project which is an open-source tool to monitor agent-to-agent communications.There are other commercial tools that can help to construct and automate red teaming, including:
CalypsoAI.com has its Inference Platform that includes agentic red teaming. Their head of product, Kim Bieler, tells CSO that there are three times when red teaming is critical: during the model development, during the larger application development process, and pre-production of any finished code.Crowdstrike AI Red Team Services includes agentic red teaming features, along with a full set of other AI protection.SPLX has its AI Platform that runs large-scale risk assessments across generative AI infrastructure and simulates thousands of interactions with various automated red-teaming methods.Microsoft has integrated its AI Red Team’s open-source toolkit Python Risk Identification Tool into Azure AI Foundry, which can simulate the behavior of an adversarial user and does automated scans and evaluates the success of its probes.Salesforce has its own automated red teaming framework for its applications infrastructure.HiddenLayer has its own agentic red team automation tool.One final note comes from Susanna Cox, who wrote in her blog: “AI agents are different. The attack surface is unlike any AI system we’ve seen before in many ways. And they’re being given permissions that no software system in history has been trusted with before, with good reason. Agent architecture determines the attack surface.”

First seen on csoonline.com

Jump to article: www.csoonline.com/article/4055224/5-steps-for-deploying-agentic-ai-red-teaming.html