What progress have you made since your last update?
I worked on this project for 7 months instead of the 2 months I had originally written in the project description, and prioritized building features and fixing bugs in inspect_ai. When I was blocked or had nothing else to work on, I resolved open issues in inspect_evals.
I authored 56 merged PRs in Inspect repos:
28 in inspect_ai
25 in inspect_evals
2 in inspect_swe
1 in inspect_scout
Some features I contributed:
Support for Gemini Computer Use
Score editing
S3 conditional writes
Google agent bridge
Gemini CLI agent
Sandbox configuration using Compose files
API key refreshing
Adapter for Harbor evals
Examples I produced that researchers can refer to:
Running Inspect evals (in Docker containers) inside an Inspect eval
How to intercept, modify, and control HTTP requests made by an agent
Meridian Labs based their Inspect Harbor package on my Harbor task implementation.
I also developed my own package, Inspect Modal Sandboxes. Meridian Labs based the Modal Sandbox in their Inspect Sandboxes package on it.
I also wrote docs for features and investigated bugs for which I didn’t author merged PRs.
What are your next steps?
Continue to work with Inspect maintainers on new, substantial features in Inspect repos.
Secure long-term funding.
Is there anything others could help you with?
Donating another $10,000 to cover the next month, until long-term funding is confirmed.
I really want to continue doing this work, and I have been doing it unpaid for the past 5 months while trying to secure longer-term funding. I’ve updated the project to request a third $10,000.
If any other grant meets my funding needs, I’ll coordinate to avoid double funding.