Luthien

Technical AI safety, Global catastrophic risks
Jai Dhyani

Active Grant
$20,470 raised
$500,000 funding goal

Project summary

  • Luthien is a non-profit developing AI Control for immediate real-world deployments, based on Redwood Research's AI Control agenda. https://luthienresearch.org/

What are this project's goals? How will you achieve them?

  • Luthien's ultimate goal is to increase the probability that effective AI Control systems will be deployed to mitigate catastrophic risks from frontier AI systems when they are developed.

  • We think getting real-world feedback ASAP on how AI Control systems perform in real-world situations and iterating aggressively to develop practical, effective Control systems significantly increases the odds that effective AI Control systems will be deployed to mitigate otherwise-catastrophic risks.

  • To that end we're developing AI Control systems targeting prosaic, lower-stakes scenarios, like occasionally-misbehaving coding assistants. By doing so, we hope to discover and solve the types of unforeseen problems that emerge when any new type of system is deployed, develop a playbook for effective AI Control deployments, establish standards and best practices for effective AI Control systems, and test and develop effective Control strategies in real-world situations. Additionally, we want to see how far we can push automated red/blue-teaming to develop effective strategies quickly.

  • Luthien's secondary goal is to establish an AI Safety presence in Seattle, where there is currently a great deal of latent talent but very few opportunities to onboard into the AI Safety space.

How will this funding be used?

  • Funding will be used for salary, API credits, other compute infrastructure, and org logistics like attending ControlConf.

  • Our minimal funding goal ($500) buys us a slightly longer runway (salary, hiring contractors for part-time work, API credits, other compute infrastructure, and org logistics like attending ControlConf).

  • At current funding levels, Luthien is on the verge of being able to hire a second person. At $5000 I'll be reasonably confident that Luthien can hire a second person full-time (because (1) I'll take it as a signal that a relatively small time investment can net sufficient donations to more than pay for the lost time and (2) with two people there will be slightly more slack such that it should be possible to do more focused deep work without needing to devote a large fraction of org resources to just fundraising).

  • (Update May 2025) After running budget scenarios over the next year I've gotten significantly more conservative about how much liquidity Luthien needs to expand hiring. Right now my main priority is keeping the runway long enough to develop a useful thing and start iterating on feedback. I do think that an expanded team can help with this, but with ramp-up time it would take at least several months for that to pay off, and I don't want to make that trade until I'm confident that we have enough liquidity to make it to iterating-on-real-world-feedback while paying the full team. Those thresholds are approximately as follows:

  • $100k: Additional hire (software development focused)

  • $225k: Two additional hires (software development focused)

  • $350k: Three additional hires (software development focused)

  • $500k: Four additional hires (three software development focused, one project management/organization/meta/outreach focused)

  • Between (and beyond) these tiers, donations buy runway, and therefore focused deep-work time that isn't spent on fundraising.

  • These amounts will in practice fluctuate as development continues (we'll need less additional runway to get to the point of iterating on feedback).

  • Past $100k it potentially becomes possible to hire a third person.

Who is on your team? What's your track record on similar projects?

  • Luthien currently consists of me, Jai Dhyani. I'm an ex-ML Software Engineer at Meta, MATS alum, and a co-author on "RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts".

What are the most likely causes and outcomes if this project fails?

  • We're unable to develop Control solutions that deliver enough benefit to justify the requisite costs in money, latency, and complexity for real-world use cases.

  • Immediate real-world use cases turn out to differ so much from high-stakes scenarios, across all relevant dimensions, that working on them yields little to no payoff for developing and deploying effective Control systems in high-stakes scenarios.

How much money have you raised in the last 12 months, and from where?

~$187k. $150k of this is a one-time seed grant from the AI Safety Tactical Opportunities Fund meant to get us off the ground and spur further donations. The remainder is from individuals donating ~$5k-$20k each.

Comments
Austin Chen

about 1 month ago

Approving this project -- my understanding is that Control is becoming an important new agenda, and I'm heartened to see this donation following AISTOF's initial seed funding. Curious if @RyanKidd has any more to share about what motivated him to make this grant?

Jai Dhyani

about 1 month ago

@Austin Thank you! This is incredibly useful.

Jai Dhyani

8 days ago

@Austin Luthien's 501c3 status is still pending, and I'm currently applying to SFF, which requires either 501c3 status or fiscal sponsorship for non-profits. Would Manifund be able to serve as a fiscal sponsor for purposes of Luthien applying to and potentially receiving an SFF grant?

Austin Chen

7 days ago

@jaiwithani Yes, we should be able to do this for your SFF application; we've historically done this for one other Manifund grantee. Note that we would ask for 5% of the grant as a fiscal sponsorship fee. Email me at austin@manifund.org to confirm this!

Jai Dhyani

6 days ago

@Austin Excellent! SFF got back to me and said I could apply with pending 501c3 status, but since the form still asks for an org with documentation proving 501c3 status I'm putting down Manifold for Charity and linking to the IRS determination letter, with a note about Luthien's pending status.

Would it be okay if, in the event SFF wants to issue a grant and Luthien's 501c3 status is still pending, I reach out then to confirm the fiscal sponsorship details? (If this happens, I'm totally fine with the 5% overhead.)

Austin Chen

4 days ago

@jaiwithani Yes, that's fine!