João Medeiros da Fonseca
Phenomenological Fine-tuning for Medical AI Alignment
Lawrence Wagner
Krishna Patel
Expanding proven isolation techniques to high-risk capability domains in Mixture-of-Experts models
Preeti Ravindra
AI Safety Camp 2026 project: Bidirectional failure modes between security and safety
Finn Metz
Funding 5–10 AI security startups through Seldon’s second SF cohort.
Muhammad Ahmad
A pilot to build policy and technical capacity for governing high-risk AI systems in Africa
Sean Peters
Measuring attack selection as an emergent capability, and extending offensive cyber time horizons to newer models and benchmarks
Sandy Tanwisuth
We reframe the alignment problem as the problem of governing meaning and intent when they cannot be fully expressed.
Parker Whitfill
Brian McCallion
A mechanistic, testable framework explaining LLM failure modes via boundary writes and attractor dynamics
Christopher Kuntz
A bounded protocol audit and implementation-ready mitigation for intent ambiguity and escalation in deployed LLM systems.
Jasraj Hari Krishna Budigam
Reusable, low-compute benchmarking that detects data leakage, outputs “contamination cards,” and improves calibration reporting.
Centre pour la Sécurité de l'IA
Leveraging 12 Nobel signatories to harmonize lab safety thresholds and secure an international agreement during the 2026 diplomatic window.
Mirco Giacobbe
Developing the software infrastructure to make AI systems safe, with formal guarantees
Gergő Gáspár
Help us solve the talent and funding bottleneck for EA and AIS.
Xyra Sinclair
Building foundational subjective judgement infrastructure
Miles Tidmarsh
Training AI to generalize compassion for all sentient beings using pretraining-style interventions as a more robust alternative to instruction tuning
Chris Canal
Enabling rapid deployment of specialized engineering teams for critical AI safety evaluation projects worldwide
Jade Master
Developing correct-by-construction world models for verification of frontier AI