Kunvar Thaman
$15K
Robert Krzyzanowski
Compute and infrastructure costs
$5K
Ihonwa Omorogieva Sylvester
Making education worthwhile in Nigeria for teachers using AI
$30K
Jesse Hoogland
6-month funding for a team of researchers to assess a novel AI alignment research agenda that studies how structure forms in neural networks
$270K
Ryan Kidd
Help us support more research scholars!
$1M
Apollo Research
Hire 3 additional AI safety research engineers / scientists
$600K
joseph bloom
Trajectory Models and Agent Simulators
$250K
Jeremy Rubinoff
Lucy Farnik
6-month salary for interpretability research focusing on probing for goals and "agency" inside large language models
$38K
Brian Tan
1.9 FTE for 9 months to pilot a training program in Manila exclusively focused on Mechanistic Interpretability
$73.5K
Ethan Josean Perez
4 different projects (finding RLHF alignment failures, debate, improving CoT faithfulness, and model organisms)
Esben Kran
Six-month support for a Program Manager to organize and execute international AI safety hackathons with Apart Research
$50K
Damiano Fornasiere and Pietro Greiner
6 months of support for Damiano and Pietro to write a paper about (dis)empowerment with Jon Richens, Reuben Adams, Tom Everitt, and Victoria Krakovna.
Lawrence Chan
3 months
Lisa Thiergart
$244K
Alexander Bistagne
Proving Computational Hardness of Verifying Alignment Desiderata
$40K
Rachel Freedman
Allow a Berkeley PhD student to devote more time and focus to AI safety research and mentorship.
$58K
Cadenza Labs
We're a team of SERI-MATS alumni working on interpretability, seeking funding to continue our research after our LTFF grant ended.
$292K
Kabir Kumar
Matthew Cameron Farrugia-Roberts
Literature review or other introductory material for Singular Learning Theory targeted at AI alignment applications
Jonas Kgomo
An AI safety lab focusing on technical alignment and governance of AI in Africa and the Global South more broadly. We are a grassroots, community-led research lab.
$120K
Johnny Lin
Play the Game. Rack Up Points. Save Humanity?
$150K
Annalise Norling
What is the probability that AGI is developed by January 1, 2043?
Ross Nordby
$12K
By porby
John Buridan
$10K
Advances in LLMs are a National Security Risk
By Marius Hobbhahn
By Froolow. Parameter Uncertainty in AI Future Forecasting
Toby Pilditch
By TD_Pilditch. Uncertainty, Structure, and Optimal Integration.
By srhoades10
By David Johnston. Why generative models handle seemingly ill-posed problems
Trevor Klee
$1K
By Trevor Klee
By Boaz Barak & Ben Edelman
By Kiel Brennan-Marquez
By Peter S. Park. It is plausible that AGI safety research should be assumed compromised once it is posted on the Internet.
Gabe Mukobi
Evaluating Multipolar Cooperation Failures Between Autonomous Language Model Systems
Fazl Barez
Funding to establish a safety and interpretability lab within the Torr Vision Group (TVG) at Oxford
Siao Si Looi
12 months of funding for 3 people to work full-time on projects supporting AI safety efforts
Jacques Thibodeau
3-month salary for AI safety work on deconfusion and technical alignment.
Bart Bussmann
1-year salary for independent research to investigate how LLMs know what they know.
Chris Leong
9-month university tuition support for technical AI safety research focused on empowering AI governance interventions.
Clark Urzo
A scalable, non-infohazardous way to quickly upskill via digestible, repeatable exercises from papers and workshops.
An association for interdisciplinary interest in AI
Can Rager
We build a scalable "Automated Circuit Discovery" method and investigate "Cleanup Behavior" to advance the interpretability of transformer models.
Aryeh L. Englander
Continuation of a previous grant to allow me to pursue a PhD in risk and decision analysis related to AI x-risks
Jaeson Booker
Collective intelligence systems, Mechanism Design, and Accelerating Alignment
Making a simple, easy-to-read platform where alignment plans and their criticisms can be seen and ranked. Currently in Stage 1.
Miguelito De Guzman
6 months of work: Evaluating a variant of GPT-2 XL that can simulate a shutdown activation, aiming to improve alignment theory & develop interpretability tools.