Resource

LLM Application Security Checklist

4 min read16-point checklist

An application built on a large language model inherits the entire web stack's threat surface and then adds a new one: a non-deterministic component that treats data as instructions. Because the model reads attacker-influenced text, calls tools, and often runs with broad backend privileges, untrusted input can now reach places that traditional input validation never had to defend.

The LLM is a new trust boundary

In a conventional application, you can reason about where untrusted data enters and where it is sanitized before it touches a database, a shell, or a browser. An LLM collapses that clean separation, because to the model there is no meaningful distinction between the developer's system prompt, the user's request, and a paragraph retrieved from a web page. All of it arrives as tokens, and any of it can steer behavior. Treat every input path into the model — user messages, retrieved documents, tool outputs, even file metadata — as an untrusted channel.

The practical consequence is that you must place a trust boundary around the model itself, not just at the edge of your application. Treat the model's output as untrusted by default, the same way you treat its input. The model is a powerful but credulous intermediary sitting between your users and your backend systems, and the security of the application depends on what you let cross those boundaries, in both directions.

Prompt injection and insecure output handling

Prompt injection (OWASP LLM01) is the signature risk. Direct injection is a user telling the model to ignore its instructions; indirect injection is more dangerous, where malicious instructions hide inside content the model ingests — a support ticket, a scraped page, a PDF — and hijack the model when it processes them. There is no known complete defense; you reduce blast radius by constraining what the model can do with a successful injection rather than assuming you can block every injection.

That blast radius is governed largely by how you handle output (OWASP LLM05: Improper Output Handling). If model output is passed downstream without validation, you reintroduce classic vulnerabilities: text rendered into a page becomes XSS, output interpolated into a query becomes SQL injection, and generated code or shell commands become remote code execution. The fix is conventional — contextually encode, validate, and parameterize everything the model emits before it reaches a browser, an interpreter, or a database — but it must be applied rigorously because the source is now adversary-influenced.

Sensitive information disclosure and system prompt leakage

LLM applications leak data in ways that are easy to overlook (OWASP LLM02). A model can surface PII, secrets, or proprietary content that appeared in its context window, its training data, or a retrieval store shared across tenants. A common failure is stuffing a retrieval pipeline with documents the current user is not authorized to see, then trusting the model to "only answer appropriately" — it will not reliably do so. Filter and authorize at the retrieval layer, before content enters the context.

Closely related is OWASP LLM07: System Prompt Leakage, a 2025 addition. The risk is not merely that an attacker can coax out your system prompt; it is that teams place secrets, credentials, or security logic inside the prompt and treat it as confidential. Assume the system prompt is recoverable, and never let it hold anything whose disclosure causes harm. Authorization and secret management must live in your application code and infrastructure, not in instructions you hand to the model.

Excessive agency, the confused deputy, and access control

The fastest-growing risk in agentic applications is OWASP LLM06: Excessive Agency — granting the model more functionality, permissions, or autonomy than the task requires. When the model can call tools that delete records, send email, or move money, a successful injection becomes an action, not just a bad answer. This is a classic confused-deputy problem: the model acts with the application's elevated backend privileges rather than the privileges of the specific user making the request.

The defense is to push authorization out of the model and into deterministic enforcement points. Scope tools narrowly, prefer read-only or low-risk operations, and require human confirmation for high-impact actions. Critically, every tool call should execute with the requesting user's identity and permissions, checked by your own access-control layer — not with a broad service credential the model can aim wherever an attacker directs it. The model proposes; your code, holding the real authority, disposes.

Data exposure to providers and unbounded consumption

Most production LLM apps send prompts to a hosted provider, which makes data governance a first-order concern. Anything placed in the context window leaves your trust boundary and is processed by a third party. Reputable enterprise providers (Anthropic, OpenAI, Google, Microsoft) offer commercial terms that exclude API inputs and outputs from model training and define retention windows, but the obligation to minimize, classify, and avoid sending unnecessary sensitive data remains yours. Read the data-handling terms for the specific tier you use, and treat consumer and enterprise plans as different risk profiles.

Finally, the metered, compute-heavy nature of these systems creates OWASP LLM10: Unbounded Consumption. Without limits, an attacker can drive denial of service or "denial of wallet" by flooding your endpoint, submitting oversized prompts, or coercing expensive multi-step loops that run up your provider bill. Enforce rate limits, token and context caps, per-user quotas, timeouts, and budget alerts so resource use is bounded by design.

Key takeaway

Treat the model as an untrusted, credulous component — validate everything it reads and emits, enforce authorization and spending limits in your own code, and never grant it more privilege than the user it is acting for.

Checklist

The LLM Application Security checklist

A practical, copy-ready list to run against your own codebase, pipeline, and AI usage.

Trust boundaries

  • Treat all model output as untrusted input to whatever consumes it downstream.
  • Separate system instructions from user content using message roles; never concatenate raw user text into the system prompt.
  • Treat retrieved and third-party content (web pages, documents, emails) as untrusted.

Input and output validation

  • Validate model output against a strict schema before acting on it.
  • Enforce allowlists for any action, URL, or identifier derived from model output.
  • Constrain the output format with JSON mode or structured outputs where the provider supports it.

Data exposure and privacy

  • Minimize sensitive data sent to external providers; redact PII and secrets from prompts.
  • Confirm the provider's data-retention and training settings match your privacy commitments.
  • Avoid logging full prompts and responses that contain sensitive data, and scrub logs that do.

Abuse and cost controls

  • Authenticate and rate-limit AI endpoints — never leave them open to the public.
  • Set per-user quotas and maximum token and output caps to prevent runaway cost abuse.
  • Add timeouts and circuit breakers around model calls.

Access control

  • Enforce the user's own permissions on any data the model retrieves or acts on, so the model cannot escalate privileges.
  • Scope tool and function access to the authenticated user's rights.

Monitoring

  • Log tool calls and high-risk actions for audit.
  • Monitor for anomalous usage, jailbreak patterns, and error spikes.

This checklist is general guidance, not a guarantee of security. A repo audit applies these checks to your actual codebase, dependencies, and AI usage and returns prioritized findings.

Want these checks run on your repository?

Book a repo audit to get prioritized findings for your codebase, LLM usage, prompts, agents, RAG, MCP tools, dependencies, secrets, containers, and infrastructure.

Book an Audit