I am requesting travel and bridge funding support as an independent researcher attending ICML 2026 (Seoul, 6–12 July 2026), where my accepted paper "On Compositional Learning Behaviours in Formal Mathematics" ([CLBinFM - arXiv:2605.28512](https://arxiv.org/abs/2605.28512)) will be presented at the 3rd AI for Math Workshop (11 July 2026).
Research context.
The paper addresses a diagnostic gap in the evaluation of frontier formal theorem proving systems. Standard benchmarks collapse multidimensional model capabilities into a single accuracy metric, concealing the distinction between shallow pattern recombination and genuine in-context symbolic abstraction. Cross-evaluating ten state-of-the-art Lean 4 theorem provers using S2B-LM — an adaptation of the Symbolic Behaviour Benchmark — I establish via exact permutation testing that Compositional Learning Behaviour (CLB) competency is a statistically significant necessary condition for Olympiad-level theorem proving performance (miniF2F > 75%; p = 0.004), after ruling out model scale as a confound.
Overall, the paper establishes that without explicit diagnostic benchmarks separating compositional learning from shallow pattern matching, it remains ambiguous whether state-of-the-art provers are acquiring authentic symbolic processing capabilities or merely overfitting to localised tactical distributions. This finding has direct methodological implications for any domain where single-metric benchmark performance is used to assess whether a model has crossed a capability threshold. While the paper itself focuses on formal mathematics, the structural concern generalises: the AI safety evaluation community's reliance on benchmark proxies to assess capability thresholds faces the same diagnostic gap that this paper identifies and empirically characterises in this domain.
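As a rough illustration of what an exact permutation test of a necessary-condition claim can look like, the sketch below uses made-up labels for ten hypothetical systems. It is not the paper's analysis code or data, and the choice of test statistic (counting high performers that lack CLB competency) is an assumption made purely for illustration; the function name `exact_necessity_pvalue` is likewise hypothetical.

```python
from itertools import combinations


def exact_necessity_pvalue(clb_competent, high_performer):
    """Exact permutation p-value for "competency is necessary for high performance".

    Null hypothesis: the high-performer labels are exchangeable across systems.
    Test statistic (an illustrative assumption): the number of violations,
    i.e. high performers that are NOT CLB-competent.
    The p-value is the fraction of all possible label assignments that show
    at most as many violations as were observed.
    """
    n = len(clb_competent)
    k = sum(high_performer)  # number of high performers stays fixed under permutation
    observed = sum(1 for c, h in zip(clb_competent, high_performer) if h and not c)

    as_extreme = 0
    total = 0
    # Enumerate every way of assigning the k "high performer" labels to n systems.
    for chosen in combinations(range(n), k):
        violations = sum(1 for i in chosen if not clb_competent[i])
        total += 1
        if violations <= observed:
            as_extreme += 1
    return as_extreme / total


# Toy labels for ten hypothetical systems (NOT the paper's results).
clb = [True] * 6 + [False] * 4
high = [True] * 4 + [False] * 6  # every high performer is CLB-competent

p_value = exact_necessity_pvalue(clb, high)
print(round(p_value, 4))  # 15 of 210 assignments have zero violations -> 0.0714
```

Because every assignment is enumerated, the p-value is exact rather than a Monte Carlo estimate, which matters when only ten systems are available.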
Funding context.
This request serves one of two purposes, depending on a pending funding decision:
- Scenario A (G-Research NextGen grant not awarded; decision expected end of June 2026): Manifund funds cover lean ICML 2026 travel costs (flights, accommodation, meals and local transport), as itemised below.
- Scenario B (G-Research NextGen grant awarded): G-Research funds cover flights and accommodation (up to £2,000). Manifund funds are redirected, with full transparency, toward one month of bridge living costs (£1,600–£2,000) and independent research continuity. This sustains my open-source safety evaluation work and enables focused preparation for the MATS Autumn 2026 programme (currently at step 2, decision pending) while I complete my PhD revision (August 2026 deadline) and pursue the current research programme as an unaffiliated researcher without institutional income.
In both scenarios, the total requested amount remains £2,140 (~$2,700) and the funds directly enable my continued presence and productivity as an independent AI safety evaluation researcher during a critical funding gap.
Institutional context.
ICML 2026 has recognised my reviewing contribution this cycle with a Gold Reviewer Award (6 papers reviewed), which provides complimentary conference registration. My postdoctoral contract at the University of Edinburgh concluded 31 March 2026; I am completing PhD thesis revisions at the University of York as an independent researcher with no departmental budget, no active grant, and no institutional income.
This travel project has four goals:
1. Present and disseminate the accepted paper's findings to the AI for Math and formal theorem proving community at ICML 2026, engaging directly with researchers working on frontier mathematical reasoning systems.
2. Collect targeted technical feedback from evaluation and AI safety researchers at the workshop and main conference to inform the next phase of the research programme, for example expanding the CLB evaluation framework beyond formal mathematics to other structured reasoning domains such as coding.
3. Establish research connections with AI safety and evaluation methodology researchers at METR, Apollo Research, UK AISI, and adjacent organisations, for whom the paper's finding (that capability proxies can produce false confidence in frontier model assessments) is directly relevant to their evaluation programmes.
4. Consolidate my transition into the AI safety and evaluation research community as an independent researcher. My postdoctoral contract concluded in March 2026; ICML 2026 represents a critical opportunity to establish the visibility and direct relationships needed to secure my next research position in AI safety evaluation or AI for formal mathematics, areas where my research programme is directly relevant.
I will achieve these goals by attending in person, presenting the paper at the AI for Math Workshop, actively participating in the main conference programme, and converting feedback and connections into a prioritised follow-up research plan.
Scenario A — G-Research NextGen grant not awarded
The funding covers lean conference travel costs:
- £1,348: return economy flights, Edinburgh–Seoul (5–12 July 2026)
- £652: accommodation contribution (7 nights; a contribution toward an expected cost of £400–£900, depending on availability and distance to venue)
- £140: meals and local transport (7 days at £20/day)
- £0: conference registration (covered by Gold Reviewer Award)
Total requested: £2,140 (~$2,700)
Any shortfall in accommodation or daily costs will be covered personally; the trip is fully viable at this funding level.
Scenario B — G-Research NextGen grant awarded
G-Research funds cover flights and accommodation (up to £2,000). Manifund funds are redirected toward:
- £1,600–£2,000: one month of bridge living costs (fixed costs: ~£950 rent, broadband, council tax, energy; variable costs: ~£650 groceries and essentials) as an unaffiliated independent researcher without institutional income
- Remainder: independent research continuity costs supporting open-source safety evaluation work and enabling focused preparation for the MATS Autumn 2026 programme (currently at step 2, decision pending) while I complete my PhD revision and pursue the current research programme.
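The figures in the two scenario budgets above can be cross-checked with a few lines of arithmetic. The snippet below is only a convenience check: the amounts are copied from the itemised lists, while the variable names are illustrative.

```python
# Scenario A line items in GBP, copied from the itemised budget above.
scenario_a = {
    "flights": 1348,
    "accommodation contribution": 652,
    "meals and local transport": 140,
    "conference registration": 0,  # covered by the Gold Reviewer Award
}
total_requested = sum(scenario_a.values())
meals_per_day = scenario_a["meals and local transport"] / 7  # 7 days

# Scenario B: one month of bridge living costs, with the rest going to research continuity.
bridge_low, bridge_high = 1600, 2000
fixed_costs, variable_costs = 950, 650  # approximate monthly figures from the list above
remainder_range = (total_requested - bridge_high, total_requested - bridge_low)

print(total_requested)               # 2140
print(meals_per_day)                 # 20.0
print(fixed_costs + variable_costs)  # 1600, the lower bound of the bridge range
print(remainder_range)               # (140, 540)
```

Under Scenario B, the research continuity remainder is therefore between £140 and £540, depending on where actual living costs fall within the £1,600–£2,000 range.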
The current paper, [CLBinFM](https://arxiv.org/abs/2605.28512), is sole-authored independent research, conducted without institutional research funding during the revision phase of my doctoral studies at the University of York (IGGI CDT, EPSRC EP/L015846/1). It represents the culmination of this research programme, spanning over five years of work on compositional generalisation, symbolic behaviours, and evaluation methodology in AI systems.
[CLBinFM](https://arxiv.org/abs/2605.28512) has been accepted to the 3rd AI for Math Workshop at ICML 2026, providing external peer validation of the work's relevance to the frontier AI for mathematics and evaluation methodology communities.
Publication record:
1. [CueTip](https://doi.org/10.1145/3721238.3730742) (SIGGRAPH Conference Papers 2025, second author), an interactive and explainable physics-aware pool assistant combining physical simulation with language model reasoning, developed during my postdoctoral research at the University of Edinburgh;
2. [A Comparison of Self-Play Algorithms Under a Generalized Framework](https://doi.org/10.1109/TG.2021.3058898) (IEEE Transactions on Games, second author), a peer-reviewed journal contribution to multi-agent reinforcement learning;
3. and a corpus of first-authored preprints spanning emergent communication, the Symbolic Behaviour Benchmark, [EReLELA](https://openreview.net/forum?id=KO6lHsx08E), and the [Differentiable Language Model framework](https://arxiv.org/abs/2602.11044).
Reviewing & Service (2021–present)
- ICML 2026: Gold Reviewer Award (6 manuscripts)
- ICLR 2022: Co-organiser & Best Reviewer, Emergent Communication Workshop
- Reviewer for ICML, NeurIPS, ICLR, and AAAI
Primary risk: travel costs are not covered in time to book flights at current prices (confirmed at £1,348 as of 29 May 2026; prices may increase).
Most likely outcome under failure: remote participation, if available. However, because the paper has been accepted as a poster, I would not be able to present it remotely.
Worst case: inability to present, loss of networking and feedback opportunities that are directly relevant to securing my next research position and shaping future research directions, and delay to the follow-up research programme.
I have applied for a G-Research NextGen travel grant (decision pending, end of June 2026) and for the MATS Autumn 2026 programme (advanced to step 2, decision pending).
No other funding has been raised in the last 12 months.
This Manifund project is part of a broader effort to bridge the funding gap between my postdoc and next position while completing my PhD revision.