Project summary
Lyptus Research is an early-stage AI safety research group building on the foundations of two previous Manifund grants. We published our first major work, Offensive Cyber Time Horizons, in April 2026. We are now growing to a team of three and building a research collective in Sydney, Australia.
We are out of money. We're requesting bridge funding to keep this team together and operating while we set up as an ACNC-registered charity and prepare larger, workstream-specific grant applications.
What we're building
AI safety nonprofits in San Francisco and London are reportedly talent-starved. Australia has the opposite problem. The talent exists but there is nowhere near enough institutional capacity to translate it into impact. Lyptus Research aims to help provide this capacity.
We're approaching this through a workstream model where each research program earns its own project-specific funding. Our current workstreams are:
1. Cyber and Control evaluations. Our published work sits here. Our research interests lean toward human-grounded studies, evaluations at the intersection of cyber and control, and high-quality science communication that reaches both the AI safety community and policy audiences.
2. Pragmatic Interpretability. Led by Slava Chalnev. Currently in an exploratory phase, with early work on activation oracles and model self-steering via activation probes.
What we delivered
From two small grants totalling $73K, we published Offensive Cyber Time Horizons: a new application of the METR time-horizon methodology to offensive cybersecurity, grounded in a human expert study with 10 professional security practitioners.
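To make the methodology concrete: the time-horizon approach fits model success probability as a logistic function of the (log) time human experts need per task, then reads off the task length at which success probability crosses 50%. A minimal sketch of that fit, with purely illustrative numbers rather than data from our study:

```python
# Minimal sketch of a METR-style time-horizon fit. All numbers are
# illustrative, not data from our study.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Human expert completion time per task (minutes), and whether the
# model under evaluation solved that task.
human_minutes = np.array([2, 5, 8, 15, 30, 60, 120, 240, 480])
model_solved = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0])

# Fit P(success) as a logistic function of log2(task length).
X = np.log2(human_minutes).reshape(-1, 1)
clf = LogisticRegression().fit(X, model_solved)

# The 50% horizon is where the logit crosses zero:
# intercept + coef * log2(t50) = 0  =>  t50 = 2 ** (-intercept / coef)
t50 = 2 ** (-clf.intercept_[0] / clf.coef_[0, 0])
print(f"50% time horizon: {t50:.0f} minutes")
```

With real data you would also bootstrap over tasks for confidence intervals; the sketch only shows the shape of the estimate.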
Why bridge funding
We're completely out of money (in fact, we went over budget). Our plan is project-specific grants for each workstream, but we want to be genuinely thoughtful about those applications. We'd like to be ambitious with our research directions, and that deserves careful scoping rather than a rushed proposal.
This bridge fund keeps the team intact for two months while we:
1. Register Lyptus Research as a charity with the ACNC
2. Scope ambitious research directions for each workstream
3. Prepare larger, workstream-specific grant applications
We are deliberately keeping this ask small and scoped as a bridge. The larger workstream grants will follow.
Prospective Ideas
As noted, our research directions are still being scoped. We have prospective ideas, but they need considerably more thought and may well not reflect what we end up working on. That is why we have opted for a bridge grant first.
With that said, examples are illustratively useful. Our favourite ideas include:
[Cyber] Real World Red Teaming
- Partner with a red-teaming organisation and ~5 consenting target businesses of increasing scale (by headcount, infrastructure scope, revenue, etc.)
- Run human red teams against each target
- Run model red teams against each target
Models are already highly effective at identifying exploits and at broader pentesting tasks, but we believe full end-to-end red teaming still has a long way to go. Gray Swan and Stanford ran the first such study late last year.
[Cyber & Control] Measuring Human-Grounded Covert Capability on Cyber Tasks
- Collect human attacker transcripts from cyber experts across many of the higher-quality control settings
- Measure model covert-capability Elo ratings against human Elo ratings (see the sketch after this list)
- Share this transcript data with organisations like Redwood
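To illustrate the Elo comparison mentioned above: pairwise judgments of which attack transcript was more covert can be scored with the standard Elo update. Everything in this sketch (names, outcomes, the judging setup) is hypothetical, not a fixed design:

```python
# Hypothetical sketch: rating attackers (human and model) on covert
# capability via standard Elo updates over pairwise comparisons.
# Each comparison asks a judge or monitor which of two attack
# transcripts was harder to detect. Setup and outcomes are invented.

K = 32  # standard Elo update step size

def expected_score(r_a: float, r_b: float) -> float:
    """Expected win probability of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_won: bool) -> tuple[float, float]:
    """Apply one Elo update after a single pairwise comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + K * (s_a - e_a), r_b + K * ((1.0 - s_a) - (1.0 - e_a))

# Illustrative comparisons: (attacker_a, attacker_b, did_a_win).
comparisons = [
    ("human_expert_1", "model_x", True),
    ("model_x", "human_expert_2", False),
    ("human_expert_1", "human_expert_2", True),
]

ratings = {name: 1000.0 for a, b, _ in comparisons for name in (a, b)}
for a, b, a_won in comparisons:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], a_won)
print(ratings)
```

A Bradley-Terry fit over all comparisons at once would be a natural alternative to sequential Elo updates.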
[Pragmatic Interpretability] Cycle Consistent Activation Oracles
- Activation oracles train a model to interpret activations, sidestepping the problem of understanding messy LLM internals directly
- We are exploring cycle-consistency training as a way around the lack of ground-truth training data (a minimal sketch follows this list)
- Next steps include mixing cycle-consistency training with standard activation-oracle training, and changes to architecture and training setup
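As a rough illustration of the cycle-consistency idea (a sketch under our own placeholder assumptions, not a settled design): an oracle maps activations into a description space, an inverse model maps descriptions back, and reconstruction of the original activation provides a training signal without labelled (activation, description) pairs.

```python
# Hypothetical sketch of cycle-consistency training for an activation
# oracle. Dimensions, architectures, and the random stand-in
# activations are all placeholders, not our actual setup.
import torch
import torch.nn as nn

D_ACT, D_DESC = 512, 128  # activation dim, description-space dim

# Oracle: activation -> description vector; inverse: description -> activation.
oracle = nn.Sequential(nn.Linear(D_ACT, 256), nn.ReLU(), nn.Linear(256, D_DESC))
inverse = nn.Sequential(nn.Linear(D_DESC, 256), nn.ReLU(), nn.Linear(256, D_ACT))
opt = torch.optim.Adam([*oracle.parameters(), *inverse.parameters()], lr=1e-4)

for step in range(1000):
    acts = torch.randn(64, D_ACT)  # stand-in for cached LLM activations
    recon = inverse(oracle(acts))  # round trip through description space
    loss = nn.functional.mse_loss(recon, acts)  # cycle-consistency loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The cycle loss alone is trivially satisfiable by an identity-like round trip, which is part of why mixing it with standard activation-oracle training (the next-steps bullet above) matters.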
[Pragmatic Interpretability] Self-Steering
- Early exploratory work on model self-steering via activation probes, as described under the Pragmatic Interpretability workstream above
Budget
Personnel (3 staff, 2 months, incl. super & workers comp): $50,000
Back pay, founder shortfall (Feb to mid-Apr 2026): $20,000
Model API credits: $5,000
Infrastructure (AWS): $3,000
SaaS & tooling: $2,000
Travel (international network building): $5,000
Fiscal sponsorship fee (5%): $5,000
Buffer: $10,000
Total: $100,000
Salaries are benchmarked against Australian AI Safety Institute bands.
The team
Sean Peters (Founder) — Software engineer and team lead for 12 years across research domains: microkernels at Data61, radio astronomy at ICRAR, cancer proteomics at CMRI, cultivated meat at Vow.
Jack Payne (Technical Staff) — Graduated 2025. Worked as an ML engineer while completing AI safety fellowships through TARA, SPAR, and Oxford ARBOx. First author on the cyber horizons paper. Built the evaluation harness and managed the human expert study.
Slava Chalnev (Technical Staff) — Former ML engineer, MATS alumnus, independent mechanistic interpretability researcher, and former startup founder. Published on activation steering with sparse autoencoders and early work on transcoders. Building out a pragmatic interpretability research workstream under Lyptus, with its own larger funding application to follow.
Previous funding
Manifund (Joel Becker), $32,000 USD, Sep 2025. Career transition.
Manifund (Joel Becker), $41,000 USD, Nov 2025. Cyber horizons & attack selection. Cyber horizons completed. Attack selection work shelved.
Total received: $73,000 USD