I think Epoch has done truly outstanding work on core trends in AI progress in the past few years. I'm also excited by their recent foray into benchmarking in the form of FrontierMath. I think highly of core team members involved in the project. I found our initial discussions about this project very promising.
Better benchmarks that help us forecast the time to AGI (and especially the time to relevant capabilities, such as automated AI research), and that do so in a highly credible and scientific way, are very valuable for informing policymakers and catalyzing important policy efforts.
Donor's main reservations
It's a pilot, so it might not work.
Epoch has other funding, but not for this effort, and benchmarking is especially expensive (API calls, labelers).
Process for deciding amount
I reviewed a proposed budget. (It is confidential; more is available on request from Manifund.)
Conflicts of interest
Please disclose e.g. any romantic, professional, financial, housemate, or familial relationships you have with the grant recipient(s).
No COIs.