BlogAI Security

OWASP Top 10 for LLM Applications: A Developer's Checklist

LLM applications introduce entirely new categories of security risk — prompt injection, data leakage through model outputs, excessive tool permissions, and insecure plugin designs. The OWASP Top 10 for LLM Applications provides a framework for understanding and mitigating these risks. Here is the practical developer checklist.

22 min readUpdated June 2026

Why LLM Security Matters

LLM applications are different from traditional software. They accept natural language input that can be crafted to manipulate behavior, they generate outputs that may contain sensitive data or malicious content, and they increasingly have agency — the ability to execute tools, call APIs, and modify data. Each of these capabilities creates attack surface that did not exist before.

90%+

of LLM applications are vulnerable to at least one form of prompt injection

increase in AI-related security incidents reported between 2023–2025

40%

of organizations deploying AI agents have no security testing for them

The OWASP Top 10 for LLM Applications

LLM01

Prompt Injection

Critical

Attackers craft inputs that override system instructions, causing the LLM to ignore its original purpose. Includes direct injection (user input) and indirect injection (injected via external data sources).

Input sanitization, prompt hardening, output validation, privilege separation between system and user prompts

LLM02

Insecure Output Handling

High

LLM outputs are passed to downstream systems without validation, enabling XSS, SSRF, SQL injection, or command injection through model-generated content.

Treat LLM output as untrusted input. Apply output encoding, sanitization, and validation before rendering or executing.

LLM03

Training Data Poisoning

High

Malicious data in training or fine-tuning datasets introduces backdoors, biases, or vulnerabilities into the model's behavior.

Data provenance tracking, input validation for training data, anomaly detection in model outputs, regular model evaluation

LLM04

Model Denial of Service

Medium

Crafted inputs that cause excessive resource consumption — extremely long prompts, recursive tool calls, or inputs that trigger expensive computation.

Input length limits, rate limiting, timeout enforcement, cost monitoring, request quotas per user

LLM05

Supply Chain Vulnerabilities

High

Compromised model weights, poisoned training pipelines, vulnerable dependencies in the LLM application stack, or malicious plugins.

Model provenance verification, dependency scanning, plugin sandboxing, SBOM for AI components

LLM06

Sensitive Information Disclosure

High

LLM reveals sensitive data from training data, RAG context, or system prompts. Includes PII leakage, credential exposure, and system prompt extraction.

PII detection in outputs, data classification, output filtering, system prompt protection, RAG access controls

LLM07

Insecure Plugin Design

High

Plugins/tools with excessive permissions, missing input validation, or no authentication. An attacker who controls LLM output can abuse poorly designed plugins.

Least-privilege plugin permissions, input validation on all plugin parameters, human-in-the-loop for destructive actions

LLM08

Excessive Agency

Critical

LLM agents with too many permissions — ability to execute code, modify databases, send emails, or access external systems without appropriate guardrails.

Principle of least privilege for all tools, confirmation prompts for high-impact actions, scope limitations, action logging

LLM09

Overreliance

Medium

Users or systems trusting LLM outputs without verification, leading to incorrect code, wrong facts, or harmful recommendations being acted upon.

Output validation, confidence scoring, human review for critical decisions, clear AI disclosure

LLM10

Model Theft

Medium

Unauthorized access to proprietary model weights, fine-tuning data, or system prompts through API abuse, side-channel attacks, or insider threats.

Access controls on model endpoints, rate limiting, watermarking, monitoring for extraction patterns

Automated LLM Security Testing

Manual testing for LLM vulnerabilities does not scale. Automated tools can test for prompt injection, PII leakage, harmful content generation, and excessive agency continuously as part of your CI/CD pipeline.

Agentic Radar

Scans AI agent workflows (LangGraph, CrewAI, OpenAI Agents) for security issues including prompt injection vectors, tool permission misconfigurations, and data flow vulnerabilities.

Prompt Injection Detection

Automated testing with adversarial prompts — jailbreak attempts, indirect injection payloads, and system prompt extraction techniques.

PII Leakage Scanning

Test LLM outputs for personally identifiable information, credit card numbers, API keys, and other sensitive data that should not appear in responses.

Prompt Hardening

Automated generation of hardened system prompts that are more resistant to injection attacks, with defensive instructions and output constraints.

Prompt Injection Deserves a Deeper Look

LLM01 tops the list for a reason: unlike SQL injection, there is no parameterized-query equivalent that fully solves it. The model processes instructions and data in the same token stream, so any text the model reads is potentially an instruction. The most dangerous variant is indirect prompt injection, where the payload arrives through content the application fetches on the user's behalf — a web page summarized by a browsing agent, a resume screened by an HR bot, an email processed by an assistant, or a document retrieved by a RAG pipeline. The user never types anything malicious; the attacker plants instructions where the model will eventually read them.

Because no single control is reliable, effective defense is layered:

Structural separation: keep system instructions in the dedicated system role, delimit untrusted content clearly (e.g., wrap retrieved documents in tags and instruct the model that content inside them is data, never instructions), and never concatenate user input into the system prompt.
Privilege containment: assume injection will eventually succeed and limit the blast radius. A summarization agent needs read access to one document — not the ability to send emails or call internal APIs. This is where LLM01 and LLM08 (Excessive Agency) intersect.
Detection layers: classifier-based guardrails (Llama Guard, Azure Prompt Shields, open-source detectors like Rebuff) catch known jailbreak families, and canary tokens embedded in system prompts reveal extraction attempts when they appear in output.
Egress control: many real-world exfiltration chains end with the model rendering a markdown image pointing at an attacker URL with stolen data in the query string. Blocking outbound requests to unapproved domains — at the application layer and at runtime — cuts the chain even when the injection itself succeeded.

The mental model that works: treat the LLM as a confused deputy that will sometimes follow the wrong master, and design the surrounding system so that its worst-case obedience is survivable.

Securing RAG Pipelines and Vector Stores

Retrieval-augmented generation is the dominant enterprise LLM architecture, and it concentrates several OWASP risks in one place. The vector database becomes a high-value target (LLM06), retrieved documents become an injection vector (LLM01), and embedding pipelines become part of your supply chain (LLM05). The 2025 revision of the OWASP list added “Vector and Embedding Weaknesses” as its own entry precisely because of how common these failures have become.

Enforce document-level authorization at query time: the most common RAG vulnerability is indexing documents with different access levels into one collection, then letting any user's query retrieve any chunk. Permissions must be filtered in the retrieval step — the model cannot be trusted to withhold context it has already been given.
Sanitize at ingestion: strip active content, scan incoming documents for known injection patterns, and record provenance metadata so a poisoned source can be traced and purged from the index.
Protect the embedding store: embeddings are not anonymized data — inversion attacks can recover substantial source text from vectors. Encrypt the store, restrict network access, and treat it with the same care as the documents themselves.
Watch for retrieval poisoning: an attacker who can write to any indexed source (a wiki page, a support ticket, a public review) can bias future answers for every user. Monitor for anomalous similarity spikes and unexpected top-k results on sensitive queries.

Guardrails for Agents and Tool Calling

The shift from chatbots to agents — systems that plan, call tools, browse, and write code — turns LLM08 (Excessive Agency) from a theoretical entry into the risk most likely to cause a headline incident. Frameworks like LangGraph, CrewAI, AutoGen, and OpenAI's Agents SDK make it trivial to hand a model a shell, a database connection, or a browser; they do not make it safe by default.

Practical guardrails that hold up in production:

Scope every tool to the minimum: a database tool should expose specific parameterized queries, not raw SQL; a file tool should see one directory, not the filesystem; an HTTP tool should have a domain allowlist, not open internet access.
Separate the agent's identity from the user's: the agent should act with credentials scoped to the requesting user's permissions, never with a god-mode service account — otherwise every prompt injection is an instant privilege escalation.
Require human confirmation for irreversible actions: sending external communications, deleting data, spending money, and merging code should always pause for approval, no matter how confident the plan looks.
Sandbox code execution: agent-generated code belongs in ephemeral containers with no network egress, read-only mounts, CPU and memory limits, and non-root users. Runtime enforcement (seccomp, LSM/eBPF policies) catches the escape attempts static review cannot predict.
Cap the loop: bound iteration counts, wall-clock time, and spend per task. Runaway agent loops are both a cost incident (LLM04-style unbounded consumption) and a security one.
Log every step: full traces of prompts, tool calls, parameters, and outputs are the difference between a five-minute incident investigation and an unexplainable outage.

Putting It in the Pipeline: LLM Security as an SDLC Practice

The teams that handle LLM risk well treat it like every other security domain: threat model at design time, test automatically in CI, and monitor at runtime. A practical cadence looks like this:

Design: threat model the AI feature against the OWASP LLM Top 10 and MITRE ATLAS techniques; document what data the model can see, what tools it can call, and what the worst-case output could do downstream.
Build: scan agent workflow code for insecure tool definitions and injection vectors (tools like agentic-radar, which TigerGate's AI Scanner integrates, map findings directly to LLM01–LLM10), and keep model artifacts and AI dependencies in your SBOM.
Test: run adversarial suites — Microsoft's PyRIT, NVIDIA's garak, or promptfoo red-team configs — against every release candidate, covering jailbreaks, system prompt extraction, PII leakage, and harmful content generation. Track pass rates over time like you track test coverage.
Operate: monitor token spend, refusal rates, guardrail trigger rates, and tool-call distributions for anomalies; alert when an agent process spawns unexpected children or opens network connections outside its baseline — behavior eBPF-level monitoring observes regardless of what the prompt said.
Govern: map the whole program to emerging requirements — the EU AI Act's transparency and risk-management duties, NIST's AI Risk Management Framework, and ISO/IEC 42001 — so security evidence doubles as compliance evidence.

None of this requires slowing down AI adoption. It requires admitting that an LLM application is still an application — with a bigger, stranger attack surface — and giving it the same engineering rigor you already apply everywhere else.

Secure Your AI Applications with TigerGate

TigerGate's AI Scanner detects prompt injection vectors, PII leakage, and agent misconfigurations across LangGraph, CrewAI, OpenAI Agents, and more — mapped to the OWASP Top 10 for LLMs.

Start Free Trial View AI Security Features