Deddy Herman Hidayat
A privacy-first SLM trained on 2,000+ real teen experiences, optimized to run without internet on low-end smartphones for accessible mental well-being
Alex Leader
Measuring whether AI can autonomously execute multi-stage cyberattacks to inform deployment decisions at frontier labs
Jacob Steinhardt
Joseph E Brown
A constraint-first approach to ensuring non-authoritative, fail-closed behavior in large language models under ambiguity and real-world pressure
Mackenzie Conor James Clark
An open-source framework for detecting and correcting agentic drift using formal metrics and internal control kernels
Krishna Patel
Expanding proven isolation techniques to high-risk capability domains in Mixture-of-Experts models
Lawrence Wagner
Xyra Sinclair
Unlocking the paradigm of agents + SQL + compositional vector search
Preeti Ravindra
AI Safety Camp 2026 project: bidirectional failure modes between security and safety
Finn Metz
Funding 5–10 AI security startups through Seldon’s second SF cohort
Sean Peters
Measuring attack selection as an emergent capability, and extending offensive cyber time horizons to newer models and benchmarks
Anthony Ware
Identifying operational bottlenecks and cruxes between alignment proposals and executable governance
Parker Whitfill
Mirco Giacobbe
Developing the software infrastructure to make AI systems safe, with formal guarantees
João Medeiros da Fonseca
Phenomenological Fine-tuning for Medical AI Alignment
Gergő Gáspár
Solving the talent and funding bottleneck for EA and AIS
Centre pour la Sécurité de l'IA
Leveraging 12 Nobel signatories to harmonize lab safety thresholds and secure an international agreement during the 2026 diplomatic window
Miles Tidmarsh
Training AI to generalize compassion for all sentient beings using pretraining-style interventions as a more robust alternative to instruction tuning
Muhammad Ahmad
A pilot to build policy and technical capacity for governing high-risk AI systems in Africa