Exploring Security Challenges in Agentic Autonomy Levels

Rebeca Moen
Feb 26, 2025 02:06

NVIDIA’s framework addresses security risks in autonomous AI systems, highlighting vulnerabilities in agentic workflows and suggesting mitigation strategies.

As artificial intelligence continues to evolve, the development of agentic workflows has emerged as a pivotal advancement, enabling the integration of multiple AI models to perform complex tasks with minimal human intervention. These workflows, however, bring inherent security challenges, particularly in systems using large language models (LLMs), according to NVIDIA’s insights shared on their blog.

Understanding Agentic Workflows and Their Risks

Agentic workflows represent a step forward in AI technology, allowing developers to link AI models for intricate operations. This autonomy, while powerful, also introduces vulnerabilities, such as the risk of prompt injection attacks. These occur when untrusted data is introduced into the system, potentially allowing adversaries to manipulate AI outputs.

To address these challenges, NVIDIA has proposed an Agentic Autonomy framework. This framework is designed to assess and mitigate the risks associated with complex AI workflows, focusing on understanding and managing the potential threats posed by such systems.

Manipulating Autonomous Systems

Exploiting AI-powered applications typically involves two elements: the introduction of malicious data and the triggering of downstream effects. In systems using LLMs, this manipulation is known as prompt injection, which can be direct or indirect. These vulnerabilities arise from the lack of separation between the control and data planes in LLM architectures.

Direct prompt injection can lead to unwanted content generation, while indirect injection allows adversaries to influence the AI’s behavior by altering the data sources used in retrieval augmented generation (RAG) tools. This manipulation becomes particularly concerning when untrusted data leads to adversary-controlled downstream actions.

Security and Complexity in AI Autonomy

Even before the rise of ‘agentic’ AI, orchestrating AI workloads in sequences was common. As systems advance, incorporating more decision-making capabilities and complex interactions, the number of potential data flow paths increases, complicating threat modeling.

NVIDIA’s framework categorizes systems by autonomy levels, from simple inference APIs to fully autonomous systems, helping to assess the associated risks. For instance, deterministic systems (Level 1) have predictable workflows, whereas fully autonomous systems (Level 3) allow AI models to make independent decisions, increasing the complexity and potential security risks.

Threat Modeling and Security Controls

Higher autonomy levels do not necessarily equate to higher risk but do signify less predictability in system behavior. The risk is often tied to the tools or plugins that can perform sensitive actions. Mitigating these risks involves blocking malicious data injection into plugins, which becomes more challenging with increased autonomy.

NVIDIA recommends security controls specific to each autonomy level. For instance, Level 0 systems require standard API security, while Level 3 systems, with their complex workflows, necessitate taint tracing and mandatory data sanitization. The goal is to prevent untrusted data from influencing sensitive tools, thereby securing the AI system’s operations.

Conclusion

NVIDIA’s framework provides a structured approach to assessing the risks associated with agentic workflows, emphasizing the importance of understanding system autonomy levels. This understanding aids in implementing appropriate security measures, ensuring that AI systems remain robust against potential threats.

For more detailed insights, visit the NVIDIA blog.

Image source: Shutterstock