1-month full-time contributing software to Inspect

Technical AI safety
Anthony Duong

Active Grant
$10,000 raised
$10,000 funding goal
Fully funded and not currently accepting donations.

Project summary

One month of full-time work contributing software to Inspect.

What are this project's goals? How will you achieve them?

Goals:

  • Make improvements to Inspect.

  • Add more evals to Inspect.

  • Test my fit as a software engineer for evals.

  • Build career capital.

How I'll achieve them:

  • Concrete ways:

    • Port benchmarks not yet in Inspect (e.g. TheAgentCompany, RE-Bench, and MLGym).

    • Develop Python packages implementing collections of Inspect solvers, tools, and scorers (e.g. Inspect Cyber).

    • Implement realistic test environments like WebArena for testing a wider range of agent scenarios in contained settings.

    • Build tools for analyzing log files/reviewing transcripts to identify reasons for failure (if Docent isn’t doing all of this).

    • Build tools for presenting collections of results in dashboards (i.e. contribute to https://github.com/ArcadiaImpact/inspect_evals_dashboard).

    • Build tools for LM agents to use (e.g. search through https://github.com/aorwall/moatless-tools for tools that might be useful and build them in Inspect).

  • Default ways/in general:

    • Try to complete open issues in Inspect repos.

    • Ask the developers in the Inspect Slack workspace how to contribute.

How will this funding be used?

This is meant to replace as much of my industry salary as possible (full replacement would be about $15,000 per month).

Who is on your team? What's your track record on similar projects?

Just me. I maintain open source projects like SAELens, neuronpedia, and SAEDashboard.

What are the most likely causes and outcomes if this project fails?

Causes:

  • I don't:

    • Ramp up on the codebase fast enough.

    • Have enough work for 1 month full-time.

Outcomes:

  • I don't:

    • Make any significant improvements to Inspect.

    • Add many evals to Inspect.

    • Know my fit as a software engineer for evals.

    • Build career capital.

How much money have you raised in the last 12 months, and from where?

None.
