Reliability and trustworthiness are a foundational bottleneck for AI today, and improving them would unlock AI's full potential across a wider range of domains where these properties are critical.
The project involves understanding, detecting, and correcting hallucinations and output errors in large language models, with the aim of making those models more reliable.
I’m an independent researcher working on improving the reliability and trustworthiness of language models. I have a PhD from the University of Cambridge and have authored 30 publications in computational science with an h-index of 21.
This work has been developed over the past several months as a self-funded effort.
Results obtained so far are promising, with an AUROC above 0.80 on hallucination detection.
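For readers less familiar with the metric, AUROC measures how well a scalar detector score separates hallucinated from faithful outputs (1.0 is perfect separation, 0.5 is chance). A minimal sketch of such an evaluation, using placeholder labels and scores rather than the project's actual data or method:

```python
# Minimal AUROC evaluation sketch for a hallucination detector.
# Labels and scores are illustrative placeholders, not project results.
from sklearn.metrics import roc_auc_score

# 1 = hallucinated output, 0 = faithful output (reference-labelled)
labels = [1, 0, 0, 1, 0, 1, 0, 0]

# Per-output detector score (e.g. some uncertainty signal);
# higher means "more likely hallucinated".
scores = [0.91, 0.12, 0.35, 0.30, 0.22, 0.66, 0.41, 0.08]

print(f"AUROC = {roc_auc_score(labels, scores):.2f}")  # ~0.87 on this toy data
```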
I’m requesting a small amount of support to cover GPU compute for the final experiments needed to make the work suitable for journal publication.
I will update the arXiv link and submit the manuscript to Transactions on Machine Learning Research (TMLR) once the paper is complete.
Complete the final experimental validation required for publication.
This will be achieved by running additional GPU experiments that validate results across multiple model architectures and assess scaling, robustness, and ablation behaviour prior to submission.
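As an illustration of what such a validation sweep might look like (the model names, seed count, and ablation switch below are hypothetical placeholders, not the project's actual configuration):

```python
# Hypothetical validation grid: each architecture is evaluated over
# several seeds, with and without an example ablation. Illustrative only.
from itertools import product

models = ["llama-2-7b", "mistral-7b", "gpt2-xl"]  # placeholder architectures
seeds = range(3)                                   # repeats for robustness
ablations = [None, "drop_uncertainty_feature"]     # example ablation flag

for model, seed, ablation in product(models, seeds, ablations):
    # In practice each line here would be submitted as a GPU job.
    print(f"job: model={model} seed={seed} ablation={ablation}")
```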
Cloud GPU compute for the final experimental runs: approximately 200 hours on an NVIDIA A100.
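As a back-of-envelope check (the hourly rate is an assumption for illustration, not a provider quote), 200 A100-hours at roughly $1–2 per hour comes to about $200–$400:

```python
# Back-of-envelope compute budget; the hourly rates are assumptions,
# not provider quotes.
gpu_hours = 200
rate_low, rate_high = 1.0, 2.0  # assumed USD per A100-hour
print(f"estimated cost: ${gpu_hours * rate_low:.0f}-${gpu_hours * rate_high:.0f}")
```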
The main risks are operational rather than conceptual. These include GPU availability, runtime issues, or inefficiencies in managing compute jobs.
If delays occur, the outcome would be a slower completion of the final experiments rather than a failure of the project itself. The core work and methodology are already established, so the risk is primarily around execution timing.
None so far.