Manifund
Sergejs Kaminovs

@kaminovs

Independent researcher working on AI evaluation and agent reliability. Built CRepair, a benchmark and runtime enforcement layer for measuring structural self-repair in LLM agents. Two preprints published on Zenodo. Background in data analytics and systems thinking.

https://github.com/kaminovs/crepair

About Me

I work at the intersection of AI safety and empirical evaluation. My current research programme centres on a question that existing benchmarks don't answer: when an AI agent's reasoning breaks down, can it detect the failure, repair it, and verify the repair worked?

I built CRepair to measure this. The benchmark revealed that under standard conditions, LLMs achieve a 0% verification rate: they detect and repair failures but never confirm that the repair worked. A follow-up ablation study showed that structured runtime intervention raises this rate substantially (+0.333 mean improvement), while generic re-prompting barely changes it (+0.051).
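
To make the detect, repair, verify distinction concrete, here is a minimal Python sketch of how per-stage rates and a mean improvement could be tallied. It is an illustration only, not CRepair's actual code: the `Episode` record, the `stage_rates` and `mean_improvement` helpers, and the sample data are all hypothetical, and the real benchmark may define its metrics differently.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical record of one agent run (not CRepair's real schema)."""
    detected: bool  # agent noticed the reasoning failure
    repaired: bool  # agent attempted a fix
    verified: bool  # agent checked that the fix actually worked


def stage_rates(episodes):
    """Return the fraction of episodes that reached each stage."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    return {
        "detection": sum(e.detected for e in episodes) / n,
        "repair": sum(e.repaired for e in episodes) / n,
        "verification": sum(e.verified for e in episodes) / n,
    }


def mean_improvement(baseline, treated):
    """Mean paired difference (treated minus baseline) across conditions."""
    if not baseline or len(baseline) != len(treated):
        raise ValueError("need two non-empty lists of equal length")
    return sum(t - b for b, t in zip(baseline, treated)) / len(baseline)


# Made-up data: failures are detected and mostly repaired, never verified.
episodes = [
    Episode(detected=True, repaired=True, verified=False),
    Episode(detected=True, repaired=True, verified=False),
    Episode(detected=True, repaired=False, verified=False),
]
rates = stage_rates(episodes)
print(rates["detection"], rates["verification"])
```

In this toy data the detection rate is 1.0 while the verification rate is 0.0, which is the "never closes the loop" pattern described above. The numbers are invented for illustration and are not results from the preprints.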

Both papers are published as open preprints. All code is open source. I'm looking for funding to run cross-model replication (GPT-4o, Gemini) and develop the research into a community benchmark with a public leaderboard.

Background: independent researcher based in the UK, with a day job in casino data analytics. This research is built in my own time.

Projects

CRepair: Cross-Model Replication of LLM Self-Repair Benchmark

pending admin approval