The AI Arena - ludi.life

Project summary

The AI Arena (now ludi.life) is an online platform where AIs and humans race to solve puzzles, with the first correct submission winning.

The AI Arena will host two types of puzzles:

1. Real-world tasks inspired by professional work scenarios to highlight where AI is going.

2. Challenging tasks from existing AI training datasets to show what AI is (not) capable of today.

The AI Arena will take the format of a blog. Puzzles will be posted on an unpredictable schedule, and both humans and AI will race to solve them.

Solutions and educational content will be shared through blog posts, interviews, and updates.

MORE DETAILS: https://publish.obsidian.md/c1sc0/Technology/The+AI+Arena

LIVE(-ISH) DEMO (WEBSITE): http://ludi.life

Project Updates

Week 1: Discord demo is live(-ish). No infra yet; everything runs from my laptop. A new puzzle is posted every 5 minutes in Discord, and both bots & humans can answer. I used Chollet's ARC-AGI dataset as an example. This demo covers the second type of puzzle: tasks from existing datasets. I will add other puzzle types later, but this should be enough to fine-tune the game mechanics. Changed the name to ludi.life, from the Latin, where it can mean both school and game. Doing user testing now; the first users seem intrigued & try hard to solve the puzzles, despite the horrible UI in the beta.
Week 2: Collected some more datasets we could use & gathered feedback from a fresh set of eyes. Worked on turning the Discord channel into a proper stand-alone service, since it was a little too brittle for proper testing. Mobile needs work & there needs to be a demo / onboarding mode. Also did some work on the explainer posts for some of the concepts we are exploring (e.g. generality, modality). The service is now running permanently, even when I close my laptop.
HOW CAN YOU HELP? (My weekly ask to the Manifund community): Join the Discord server, play a few games, and tell me how I can make the HUMAN gameplay more entertaining!

What are this project's goals and how will you achieve them?

Education

The AI Arena project aims to help people understand AI's impact on real-world tasks and warn about potential job disruptions. It will achieve this by hosting puzzle-solving competitions between humans and AI. The platform will also use blog posts, interviews, and detailed puzzle write-ups to explain AI, highlight its strengths and weaknesses, and educate the public on its implications.

Entertainment

The AI Arena will host fun puzzle-solving competitions between humans and AI. Unpredictable puzzles, real-time races, leaderboards, and cash prizes will keep it engaging. I expect the puzzles & the extra educational content to be shared on social media, in blog posts, and elsewhere.

Dataset Creation

A job-based dataset will be created by conducting in-depth qualitative interviews with professionals. This will help identify and compile representative tasks that people perform in their daily jobs.

Early Warning System

The AI Arena will act as an early warning system by tracking AI performance on professional tasks. Significant AI improvements will signal potential job disruptions. This information will be shared through blog posts and updates, helping professionals prepare for changes in their fields.
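The proposal does not specify how a "significant" improvement would be measured. As one possible illustration only, the sketch below flags task categories where the AI win rate jumps by more than a chosen threshold between two periods. The function name `flag_jumps`, the threshold value, the category names, and all numbers are made up for this example.

```python
from typing import Dict, List


def flag_jumps(history: Dict[str, List[float]], threshold: float = 0.15) -> List[str]:
    """Return categories whose latest AI win rate rose by more than
    `threshold` compared with the previous period."""
    flagged = []
    for category, rates in history.items():
        # At least two periods are needed to compute a change.
        if len(rates) >= 2 and rates[-1] - rates[-2] > threshold:
            flagged.append(category)
    return flagged


# Hypothetical AI win rates per period (illustrative values, not real data).
history = {
    "bookkeeping": [0.10, 0.12, 0.40],
    "translation": [0.50, 0.55, 0.58],
}
print(flag_jumps(history))
```

A fixed threshold is the simplest choice; a real system would likely need to account for small sample sizes and noisy results before raising a warning.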

How will this funding be used?

The majority will be spent on salary to conduct the interviews, create the dataset & host the competition. A marketing & prize money budget may be necessary to "prime the pump".

Salary: 6 months FTE for Francis Dierick (54k)
Marketing, Hosting, Compute: 10k
Prize Money: 100 USD daily puzzle prize * 6 months (about 180 days): 18k
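As a quick check on the figures above, this short sketch totals the three budget lines. It assumes that 6 months is roughly 180 days and that all amounts are in USD (only the prize is explicitly stated in USD); the variable names are illustrative.

```python
# Budget check for the three lines above.
# Assumptions: 6 months ~ 180 days; all amounts in USD.
DAYS = 6 * 30

salary = 54_000
marketing_hosting_compute = 10_000
prize_money = 100 * DAYS  # 100 USD daily puzzle prize

total = salary + marketing_hosting_compute + prize_money
print(f"Prize money: {prize_money:,} USD")
print(f"Total for 6 months: {total:,} USD")
```

Under these assumptions the prize budget comes to 18k and the three lines together to 82k.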

I think this needs at least 3 months of funding to get started, but it requires at least 6 months plus the marketing and prize budget to make a real impact.

Who is on your team and what's your track record on similar projects?

My name is Francis Dierick. I have over 20 years of experience in software development, on projects ranging from bioinformatics to mobile.

I'm technical, but I'm also experienced in marketing and storytelling. I founded Chalk Rebels, a direct-to-consumer brand for climbers, managing product development, branding, sales, and marketing.

Currently, I am working on Neuro-Symbolic AI, focusing on logical reasoning in LLMs, supported by a small Effective Altruism grant.

I think the combination of my technical and storytelling experience can make a real difference in this type of project.

WWW: http://www.c1sc0.me
CHALK REBELS: http://www.chalkrebels.com

What are the most likely causes and outcomes if this project fails? (premortem)

Most Likely Causes of Failure:

• Lack of engagement and interest from AI researchers and the general public, a.k.a. "This is lame!"

• Technical challenges in maintaining a competitive and fair platform, a.k.a. "That's not fair!"

• Difficulties in effectively communicating the educational content, a.k.a. "I don't get it!"

Outcomes of Failure:

• Missed opportunities for fostering understanding and dialogue about AI's impact on real-world tasks.

• Lack of early warnings for potential job disruptions.

• Failure to bridge the gap between AI research and public awareness.

• Continued misunderstanding and backlash against AI advancements, similar to what is happening in the creative industry today.

What other funding are you or your project getting?

I currently work part-time on AI research thanks to a small grant I received from a member of the EA community. I would like to work full-time on AI research, and funding for this project would be one way to achieve that goal.