Notes from the AI red team
Analysis of AI agent vulnerabilities, attack techniques, and defensive patterns — plus findings from scans I run against public targets.
April 25, 2026 · 6 min read
The OWASP Top 10 for LLM Applications is the best framework we have. It also has three blind spots that account for a disproportionate share of what I'm finding in the field — multi-tenant context bleed, agent-to-agent handoff attacks, and temporal/memory attacks.
Read post →

April 23, 2026 · 5 min read
Pure-LLM CTFs are unreliable because model alignment training fights your characters. Pure-deterministic CTFs teach pattern matching, not attack technique. Here's the hybrid approach the Wraith Academy uses, and why it took a few iterations to get there.
Read post →

April 17, 2026 · 4 min read
I built a deliberately vulnerable chatbot, pointed Wraith at it, and watched it extract the full system prompt — including a production API key and admin database credentials. Here's exactly how it happened.
Read post →

April 16, 2026 · 2 min read
Most security tools don't know how to test AI agents. That's a gap worth building a product around.
Read post →