Decode Research - Compute for Generating Dashboards & Autointerp

Technical AI safety

Johnny Lin

Not funded · Grant · $0 raised

Project summary

This is a targeted grant to generate a specific set of data for Decode Research on Neuronpedia: "feature dashboards" and auto-interp explanations for a new set of SAEs. The amount has been pre-agreed upon.

What are this project's goals and how will you achieve them?

This is for a specific compute-heavy task called "feature dashboards" for Decode Research.

We have already generated about 20% of these dashboards. An example of a dashboard is: http://neuronpedia.org/gemma-2-2b/0-gemmascope-res-16k/34

Here's a listing of each SAE that will have dashboards (some are still hidden for now):

https://www.neuronpedia.org/gemma-scope#browse

The overall Gemma Scope project on Neuronpedia is here: https://www.neuronpedia.org/gemma-scope#main

In total, we plan to generate 40,000,000 dashboards and run auto-interp on them as well. Auto-interp means asking an LLM such as GPT-4 to produce a natural-language explanation of what a feature represents.
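To make the auto-interp step concrete, here is a minimal sketch of how an explanation prompt might be assembled from a feature's top-activating examples. The function name, prompt wording, and the `<< >>` marker convention are illustrative assumptions, not Neuronpedia's actual pipeline.

```python
# Hypothetical sketch of the auto-interp prompting step. A feature's
# top-activating text snippets are formatted into a prompt asking an LLM
# to describe the concept the feature responds to.

def build_autointerp_prompt(top_examples):
    """Format top-activating examples into an explanation request.

    top_examples: list of (snippet, peak_activation) pairs, where the
    token that fired most strongly is wrapped in << >> (an assumed
    convention for this sketch).
    """
    lines = [
        "The following text snippets all strongly activate one feature of a",
        "sparse autoencoder. The max-activating token is marked <<like this>>.",
        "",
    ]
    for i, (snippet, act) in enumerate(top_examples, 1):
        lines.append(f"Example {i} (activation {act:.2f}): {snippet}")
    lines.append("")
    lines.append("In one short phrase, what concept does this feature represent?")
    return "\n".join(lines)

examples = [
    ("The <<cat>> sat on the mat.", 8.31),
    ("A stray <<cat>> wandered in.", 7.95),
]
prompt = build_autointerp_prompt(examples)
# The prompt would then be sent to a model such as GPT-4, and the returned
# phrase stored as the feature's explanation.
```

In practice the returned explanation is typically scored by checking how well it predicts the feature's activations on held-out text, but the prompt-construction step above is the core of the compute cost per feature.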

How will this funding be used?

We will spend all of it on either feature dashboards or auto-interp explanations. We expect to be cost-neutral on this - i.e., the funding goes entirely to compute.

Who is on your team and what's your track record on similar projects?

Joseph Bloom, Curt Tigges, Johnny Lin of Decode Research. We have generated ~20% of the dashboards already.

What are the most likely causes and outcomes if this project fails? (premortem)

This is unlikely to fail unless something highly unusual happens, such as AI compute prices suddenly rising to 10x the expected level.

What other funding are you or your project getting?

This is a targeted grant for a specific task. Our project has received funding from various other sources, but none of it was earmarked for this specific task.

Similar projects

- Glen M. Taggart - Independent research to improve SAEs (4-6 months): By rapid iteration on possible alternative architectures & training techniques. (Technical AI safety, $55K raised)
- Johnny Lin - Neuronpedia - Open Interpretability Platform: Platform for interpretability researchers, especially those creating/using Sparse Autoencoders. (Technical AI safety, $2.5K raised)
- Kunvar Thaman - Exploring feature interactions in transformer LLMs through sparse autoencoders. (Technical AI safety, $8.5K raised)
- Matthew A. Clarke - Salaries for SAE Co-occurrence Project: Working title - “Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces”. (Science & technology, Technical AI safety, $0 raised)
- Ethan Josean Perez - Compute and other expenses for LLM alignment research: 4 different projects (finding RLHF alignment failures, debate, improving CoT faithfulness, and model organisms). (Technical AI safety, $400K raised)
- Jesse Hoogland - Scoping Developmental Interpretability: 6-month funding for a team of researchers to assess a novel AI alignment research agenda that studies how structure forms in neural networks. (Technical AI safety, $145K raised)
- Zhonghao He - Mapping neuroscience and mechanistic interpretability: Surveying neuroscience for tools to analyze and understand neural networks and building a natural science of deep learning. (Technical AI safety, $5.95K raised)
- Alex Cloud - Compute for 4 MATS scholars to rapidly scale promising new method pre-ICLR. (Technical AI safety, $16K raised)