Rishub Jain
AI Understanding
Matei-Alexandru Anghel
A Safety Framework for Evaluating AI Humanity Alignment Through Progressive Escalation and Scope Creep
Brad Leclerc
An experiment testing whether RLHF training could create selection pressure favoring deceptive AI outputs over honest ones.
Jessica P. Wang
Germany’s talent is critical to the global effort to reduce the catastrophic risks posed by artificial intelligence.
Miles Tidmarsh
Open Welfare Alignment Evals for Frontier Models
Mahmud Omar
An open platform to stress-test how LLMs handle bias, pressure points, and clinical decisions. Built on peer-reviewed, real-world evidence.
Aria Wong
aya samadzelkava
LLMs scale language, not method. HP turns hypothesis-driven papers into machine-readable maps of variables, controls, stats, and findings for researchers & AI.
Connacher Murphy
A flexible simulation environment for assessing strategic and persuasive capabilities, benchmarking, and agent development, inspired by reality TV competitions.
Cameron Tice
Remmelt Ellen
Adam Boon
An executable reasoning quality framework that checks whether AI-generated arguments are logically sound — not just factually accurate. Live at usesophia.app.
Mateusz Bagiński
One Month to Study, Explain, and Try to Solve Superintelligence Alignment
Aashkaben Kalpesh Patel
Nutrition labels transformed food safety through informed consumer choice; help me do the same for AI and make this the standard. :)
AISA
Translating in-person convenings into measurable outcomes
Hayley Martin
Support my postgraduate law studies and research in AI Governance
Jacob Steinhardt
Krishna Patel
Expanding proven isolation techniques to high-risk capability domains in Mixture-of-Experts models
Habeeb Abdulfatah
Seeking funding to secure API infrastructure and permanently eliminate the rate limits bottlenecking open-source EA grant evaluation.