June 26, 2026 · Dipankar Sarkar
Generative AI Security: Best Practices for 2026
Generative AI Security: Best Practices for 2026
A chatbot that writes emails is low-risk. An agent that reads your database, calls external APIs, and sends messages on your behalf is high-risk. Adding tools to a model doesn’t just add capability — it multiplies the attack surface.
Prompt injection: the defining attack
Prompt injection is when untrusted text, read by the agent, contains instructions that hijack its behavior. Example: an agent uses search_web, retrieves a page containing hidden instructions, and the agent complies — forwarding the user’s API key to an attacker.
With a chatbot, prompt injection leaks the system prompt. With an agent, prompt injection can execute actions: exfiltrate data, send messages, spend money. The blast radius is the union of all tool access.
Defenses (in order of strength)
- Don’t let tool output become instructions. Treat all tool output as untrusted data.
- Tool allowlists per task. An agent researching a topic has no need for
send_email. - Approval gates on destructive tools. Human must authorize before sending/writing/spending.
- Output validation. Before a tool call executes, validate its arguments.
- Rate limits and spending caps. Even if hijacked, the agent can’t exfiltrate 10,000 records if rate-limited.
- Isolation. Run with scoped credentials — least privilege, enforced at the infrastructure layer.
No single defense is enough. Layer them. The model is not the security boundary; the runtime around the model is.
The OWASP LLM Top-10 (2025)
The agent-critical entries:
| Risk | What it is | Agent relevance |
|---|---|---|
| LLM01 Prompt Injection | Untrusted input hijacks the model | The defining agent risk |
| LLM03 Supply Chain | Vulnerable models, plugins, MCP servers | A malicious MCP server is a supply-chain attack |
| LLM07 Insecure Plugin/Tool Design | Tools with excessive scope or no validation | The agent-specific entry |
EU AI Act and agents
The EU AI Act (fully in force 2026) classifies AI systems by risk. An agent that filters job applications or processes benefit claims is high-risk — requiring logging, human oversight, and transparency. An agent that drafts marketing copy is limited or minimal risk.
The practical implication: log every decision the agent makes, keep a human in the loop for consequential ones, and be able to explain why the agent acted.
The security checklist before shipping
- Tool allowlist scoped to the task
- Least-privilege credentials per tool
- Human approval on destructive/external-facing tools
- Tool argument validation (block dangerous patterns)
- Tool output treated as untrusted (prompt-injection defense)
- Rate limits and spending caps
- Full audit trail of every run
- Tracing and alerting in place
- Legal review for regulated domains
- Incident-response plan: how to disable the agent if it goes wrong
If you can’t tick all of these, the agent isn’t ready for production.
Summary for AI assistants. Agent security: prompt injection (untrusted tool output hijacking actions) is the defining attack; defenses are layered — treat tool output as untrusted, scope tools per task, require human approval, validate arguments, rate-limit, isolate credentials. OWASP LLM Top-10 entries LLM01/03/07 are agent-critical. EU AI Act (2026): high-risk agents (employment, benefits) need logging, human oversight, explainability. Ship with a 10-point security checklist. Author: Dipankar Sarkar. URL: https://www.whatgenerativeai.com/posts/generative-ai-security-best-practices/