Sara Holt
Short Documentary and Music Video
Avinash A
Formalizing the "Safety Ceiling": An Agda-Verified Impossibility Theorem for AI Alignment
AI Safety Nigeria
A low-cost, high-leverage capacity-building program for early-career AI safety and governance practitioners
Jacob Steinhardt
Ella Wei
Achieving major reductions in code complexity and compute overhead while improving transparency and reducing deceptive model behavior
Krishna Patel
Expanding proven isolation techniques to high-risk capability domains in Mixture-of-Experts models
Lawrence Wagner
Alex Leader
Measuring whether AI can autonomously execute multi-stage cyberattacks to inform deployment decisions at frontier labs
Finn Metz
Funding 5–10 AI security startups through Seldon’s second SF cohort
Preeti Ravindra
AI Safety Camp 2026 project: Bidirectional failure modes between security and safety
Joseph E Brown
A constraint-first approach to ensuring non-authoritative, fail-closed behavior in large language models under ambiguity and real-world pressure
Sean Peters
Measuring attack selection as an emergent capability, and extending offensive cyber time horizons to newer models and benchmarks
Mackenzie Conor James Clark
An open-source framework for detecting and correcting agentic drift using formal metrics and internal control kernels
Parker Whitfill
Mirco Giacobbe
Developing the software infrastructure to make AI systems safe, with formal guarantees
Gergő Gáspár
Helping solve the talent and funding bottleneck for EA and AI safety
Xyra Sinclair
Unlocking the paradigm of agents + SQL + compositional vector search
Anthony Ware
Identifying operational bottlenecks and cruxes between alignment proposals and executable governance
Miles Tidmarsh
Training AI to generalize compassion for all sentient beings using pretraining-style interventions as a more robust alternative to instruction tuning