Agent Threat Rules (ATR) — Open Runtime Detection Standard for AI Agents

Seeking regrant from Marius Hobbhahn (Apollo), Joel Becker (METR), Ethan Perez (Anthropic), Richard Ngo

SHORT SUMMARY

MIT-licensed open detection rule corpus catching prompt injection, tool poisoning, agent manipulation, and supply-chain compromise in deployed AI agents. 330 plus rules, already in production at Microsoft and Cisco, 97.1 percent recall on NVIDIA garak benchmark. Built solo in 60 days. Funding locks the open standard before commercial vendors fragment the ecosystem.

WHY THIS MATTERS FOR AI SAFETY

Pre-deployment evals cannot catch every emergent behavior once agents are live. Runtime detection is the only remaining control surface in production.

ATR is one of the only two open, community-governed runtime detection projects worldwide. Closed vendors have raised over USD 280M but keep rules proprietary. An open commons (like YARA for malware or Falco for cloud) lets safety improvements reach all deployments instantly.

Direct x-risk angle: in scheming or takeover scenarios, misaligned agents will manipulate tools, poison memory, and pursue hidden goals in production. ATR provides the open, vendor-neutral detection layer that gives defenders real-time visibility.

EMPIRICAL RESULTS

97.1 percent recall on NVIDIA garak inthewild_jailbreak_llms (666 samples)

0.20 percent false positive on benign skills (498 samples)

96K plus production skills scanned, 751 malicious instances catalogued

100 percent NIST AI RMF v2.1.0 compliance mapping

PRODUCTION DEPLOYMENT

Cisco AI Defense: full rule pack merged (PR 79 + PR 99)

Microsoft Agent Governance Toolkit: 287 rules + weekly auto-sync merged (PR 908 + PR 1277)

Active integrations: NVIDIA garak (PR 1676), Gen Digital Sage (PR 33), IBM mcp-context-forge (PR 4109)

npm: roughly 23K downloads per month combined

WHAT FUNDING ENABLES (6 MONTHS)

Expand corpus from 330 to 500 plus rules with focus on multi-agent attack patterns. Complete 5-framework compliance mapping (EU AI Act, NIST AI RMF, ISO 42001, OWASP Agentic, OWASP LLM). Commission external security audit. Onboard 2 additional maintainers with commit rights for governance bus-factor reduction. Run public RFC process and quarterly community calls. Expand open-core Migrator format adapters (Falco, Splunk-SPL, Wazuh, Elastic-ECS, Suricata).

Funding is USD 30K minimum to USD 75K target. If only USD 30K is funded I prioritize corpus expansion, compliance mapping, and audit; defer governance and translation. If USD 75K I complete all 6 deliverables on the 6-month timeline.

MAINTAINER

LIN, KUAN-HSIN (Adam Lin), Taiwan. Solo independent maintainer. No PhD, no institutional affiliation. Cross-disciplinary background: real estate sales (USD 1M in 3 months at age 27), content marketing (300M Threads impressions in 3 months), Taiwan's longest-running hip-hop music festival 5th year. Pivoted to AI agent security 60 days ago after observing distilled-LLM weaponization for information warfare.

NCCoE Community of Interest member confirmed 2026-05-09. NIST CAISI direct outreach via regulations.gov RFI 2026-05-09. Audrey Tang (Taiwan ex-digital-minister) follows on Threads.

LINKS

Repo: github.com/Agent-Threat-Rule/agent-threat-rules

npm: agent-threat-rules v2.1.0

Migrator: npmjs.com/package/@panguard-ai/migrator-community v0.1.0

DOI: 10.5281/zenodo.19178002

Public ecosystem map: sovereign-ai-defense.vercel.app

NIST AI RMF mapping page: agentthreatrule.org/en/compliance/nist-ai-rmf

GitHub: eeee2345 (also Adamthereal)

Email: adam@agentthreatrule.org

CONFLICTS

I am also founder of PanGuard, a commercial implementation of ATR (open-core model: Snyk plus open scanner, HashiCorp plus open Terraform precedent). ATR rules are MIT licensed in perpetuity per public GOVERNANCE.md. PanGuard revenue is targeted but not yet realized. This regrant funds the open ATR work, not PanGuard product development. YC S26 application submitted Day 54.

PARALLEL FUNDING PURSUED

LTFF (AI safety angle), ARM Fund (x-risk infrastructure), NLnet NGI Zero Commons (open digital commons), GitHub Secure Open Source Fund (maintainer security program), OpenSSF Alpha-Omega (AI-driven threat fixes), Schmidt Sciences Trustworthy AI Tier 1 individual researcher track. Compute credits via Anthropic External Researcher and Microsoft Founders Hub. Manifund regrant is the fastest cash path; most other funders have 30-90 day cycles.

Agent Threat Rules (ATR) — Open Runtime Detection Standard for AI Agents

Offer to donate