Project summary
Lyptus Research is an early-stage AI safety research group building on the foundations of two previous Manifund grants. We published our first major work, Offensive Cyber Time Horizons, in April 2026. We are now growing to a team of three and building a research collective in Sydney, Australia.
We are out of money. We're requesting bridge funding to keep this team together and operating while we set up as an ACNC-registered charity and prepare larger, workstream-specific grant applications.
What we're building
AI safety nonprofits in San Francisco and London are reportedly talent-starved. Australia has the opposite problem. The talent exists but there is nowhere near enough institutional capacity to translate it into impact. Lyptus Research aims to help provide this capacity.
We're approaching this through a workstream model where each research program earns its own project-specific funding. Our current workstreams are:
1. Cyber and Control evaluations. Our published work sits here. Our research interests lean toward human-grounded studies, evaluations at the intersection of cyber and control, and high-quality science communication that reaches both the AI safety community and policy audiences.
2. Pragmatic Interpretability. Led by Slava Chalnev. Currently in an exploratory phase, with early work on activation oracles and model self-steering via activation probes.
What we delivered
From two small grants totalling $73K, we published Offensive Cyber Time Horizons: a new application of the METR time-horizon methodology to offensive cybersecurity, grounded in a human expert study with 10 professional security practitioners.
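To make the methodology concrete: the time-horizon approach fits model success probability as a logistic function of the (log) time human experts need per task, then reads off the task length at which success probability crosses 50%. A minimal sketch of that fit, with purely illustrative numbers rather than data from our study:

```python
# Minimal sketch of a METR-style time-horizon fit. All numbers are
# illustrative, not data from our study.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Human expert completion time per task (minutes), and whether the
# model under evaluation solved that task.
human_minutes = np.array([2, 5, 8, 15, 30, 60, 120, 240, 480])
model_solved = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0])

# Fit P(success) as a logistic function of log2(task length).
X = np.log2(human_minutes).reshape(-1, 1)
clf = LogisticRegression().fit(X, model_solved)

# The 50% horizon is where the logit crosses zero:
# intercept + coef * log2(t50) = 0  =>  t50 = 2 ** (-intercept / coef)
t50 = 2 ** (-clf.intercept_[0] / clf.coef_[0, 0])
print(f"50% time horizon: {t50:.0f} minutes")
```

With real data you would also bootstrap over tasks for confidence intervals; the sketch only shows the shape of the estimate.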
Why bridge funding
We're completely out of money (in fact, we went over budget). Our plan is project-specific grants for each workstream, but we want to be genuinely thoughtful about those applications. We'd like to be ambitious with our research directions, and that deserves careful scoping rather than a rushed proposal.
This bridge fund keeps the team intact for two months while we:
1. Register Lyptus Research as a charity with the ACNC
2. Scope ambitious research directions for each workstream
3. Prepare larger, workstream-specific grant applications
We are deliberately keeping this ask small and scoped as a bridge. The larger workstream grants will follow.
Prospective Ideas
As noted, our research directions are still being scoped. We have prospective ideas, but they need considerably more thought and may well not reflect what we end up working on. That is why we have opted for a bridge grant first.
With that said, examples are illustratively useful. Our favourite ideas include:
[Cyber] Real World Red Teaming
- Partner with a red-teaming organisation and ~5 consenting target businesses of increasing scale (by headcount, infrastructure scope, revenue, etc.)
- Run human red teams against each target
- Run model red teams against each target
Models are already highly effective at identifying exploits and at broader pentesting tasks, but we believe full end-to-end red teaming still has a long way to go. Gray Swan and Stanford ran the first such study late last year.
[Cyber & Control] Measuring Human-Grounded Covert Capability on Cyber Tasks
- Collect human attacker transcripts from cyber experts across many of the higher-quality control settings
- Measure model covert-capability Elo ratings against human Elo ratings (see the sketch after this list)
- Share this transcript data with organisations like Redwood
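To illustrate the Elo comparison mentioned above: pairwise judgments of which attack transcript was more covert can be scored with the standard Elo update. Everything in this sketch (names, outcomes, the judging setup) is hypothetical, not a fixed design:

```python
# Hypothetical sketch: rating attackers (human and model) on covert
# capability via standard Elo updates over pairwise comparisons.
# Each comparison asks a judge or monitor which of two attack
# transcripts was harder to detect. Setup and outcomes are invented.

K = 32  # standard Elo update step size

def expected_score(r_a: float, r_b: float) -> float:
    """Expected win probability of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_won: bool) -> tuple[float, float]:
    """Apply one Elo update after a single pairwise comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + K * (s_a - e_a), r_b + K * ((1.0 - s_a) - (1.0 - e_a))

# Illustrative comparisons: (attacker_a, attacker_b, did_a_win).
comparisons = [
    ("human_expert_1", "model_x", True),
    ("model_x", "human_expert_2", False),
    ("human_expert_1", "human_expert_2", True),
]

ratings = {name: 1000.0 for a, b, _ in comparisons for name in (a, b)}
for a, b, a_won in comparisons:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], a_won)
print(ratings)
```

A Bradley-Terry fit over all comparisons at once would be a natural alternative to sequential Elo updates.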
[Pragmatic Interpretability] Cycle Consistent Activation Oracles
- Activation oracles train a model to interpret activations, sidestepping the problem of understanding messy LLM internals directly
- We are exploring cycle-consistency training as a way around the lack of ground-truth training data (a minimal sketch follows this list)
- Next steps include mixing cycle-consistency training with standard activation-oracle training, and changes to architecture and training setup
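As a rough illustration of the cycle-consistency idea (a sketch under our own placeholder assumptions, not a settled design): an oracle maps activations into a description space, an inverse model maps descriptions back, and reconstruction of the original activation provides a training signal without labelled (activation, description) pairs.

```python
# Hypothetical sketch of cycle-consistency training for an activation
# oracle. Dimensions, architectures, and the random stand-in
# activations are all placeholders, not our actual setup.
import torch
import torch.nn as nn

D_ACT, D_DESC = 512, 128  # activation dim, description-space dim

# Oracle: activation -> description vector; inverse: description -> activation.
oracle = nn.Sequential(nn.Linear(D_ACT, 256), nn.ReLU(), nn.Linear(256, D_DESC))
inverse = nn.Sequential(nn.Linear(D_DESC, 256), nn.ReLU(), nn.Linear(256, D_ACT))
opt = torch.optim.Adam([*oracle.parameters(), *inverse.parameters()], lr=1e-4)

for step in range(1000):
    acts = torch.randn(64, D_ACT)  # stand-in for cached LLM activations
    recon = inverse(oracle(acts))  # round trip through description space
    loss = nn.functional.mse_loss(recon, acts)  # cycle-consistency loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The cycle loss alone is trivially satisfiable by an identity-like round trip, which is part of why mixing it with standard activation-oracle training (the next-steps bullet above) matters.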
[Pragmatic Interpretability] Self-Steering
- Early exploratory work on model self-steering via activation probes, as described under the Pragmatic Interpretability workstream above
Budget
Personnel (3 staff, 2 months, incl. super & workers comp): $50,000
Back pay, founder shortfall (Feb to mid-Apr 2026): $20,000
Model API credits: $5,000
Infrastructure (AWS): $3,000
SaaS & tooling: $2,000
Travel (international network building): $5,000
Fiscal sponsorship fee (5%): $5,000
Buffer: $10,000
Total: $100,000
Salaries are benchmarked against Australian AI Safety Institute bands.
The team
Sean Peters (Founder) — Software engineer and team lead for 12 years across research domains: microkernels at Data61, radio astronomy at ICRAR, cancer proteomics at CMRI, cultivated meat at Vow.
Jack Payne (Technical Staff) — Graduated 2025. Worked as an ML engineer while completing AI safety fellowships through TARA, SPAR, and Oxford ARBOx. First author on the cyber horizons paper. Built the evaluation harness and managed the human expert study.
Slava Chalnev (Technical Staff) — Former ML engineer, MATS alumnus, independent mechanistic interpretability researcher, and former startup founder. Published on activation steering with sparse autoencoders and early work on transcoders. Building out a pragmatic interpretability research workstream under Lyptus, with its own larger funding application to follow.
Previous funding
Manifund (Joel Becker), $32,000 USD, Sep 2025. Career transition.
Manifund (Joel Becker), $41,000 USD, Nov 2025. Cyber horizons & attack selection. Cyber horizons completed. Attack selection work shelved.
Total received: $73,000 USD