@KabirKumar
Lead at AI-plans.com
https://ai-plans.com/
$0 in pending offers
Background: DevOps, Admin & Marketing
Values: Humanity, Agency, Truth
Cause prioritization: Alignment. Currently focused on organizing the field and building a way to recognize bad ideas.
Kabir Kumar
9 months ago
@Austin Thank you!! I know the site design still needs a lot of work! We're working on a rebuild at the moment, which will be ready soon!
To be clear, Tetraspace was a participant.
Kabir Kumar
about 1 year ago
From what I know, AI Safety Careers isn't funding-constrained. How would the funding help with this?
Kabir Kumar
about 1 year ago
I think this could be really useful and the folks at Stampy seem to be doing a lot of good work.
Kabir Kumar
about 1 year ago
Awesome!! Thank you very much!! You might be interested to know that the event has produced many very well thought out critiques and got more people involved and interested in AI Safety, especially in what actually goes into making a plan robust. On top of that, in the first two days we produced an extremely useful document: https://docs.google.com/document/d/1GQbAnRPvONF8TdQtQuga4WOLk58iNh3tTdsVyGpA4AE/edit?usp=sharing
Multiple people have talked about how useful and easy to use this document is, often expressing confusion as to why no one has made something like it before!
Kabir Kumar
about 1 year ago
Great news! Dr Peter S. Park, an AI Safety postdoc at the Tegmark lab, has agreed to be a judge!
Kabir Kumar
about 1 year ago
Excited to say that we have 20 participants for the critique-a-thon so far!
Kabir Kumar
about 1 year ago
1st to 2nd: Make a list of all the ways alignment plans could go wrong.
We'll put together a master list of potential "vulnerabilities" based on existing research and our own ideas. This will give us a checklist to use when evaluating plans.
3rd to 4th: Match vulnerabilities to plans.
Everyone will pick a few alignment plans to look at more closely. For each plan, you'll label up to 5 vulnerabilities you think could apply and point out evidence from the plan that supports them. Include your level of confidence in each label as a percentage.
5th to 8th: Argue for and against the vulnerabilities.
You'll team up with another participant and take turns, with one defending and the other questioning the vulnerabilities suggested in Step 2. This debate format will help strengthen the critiques. We'll swap sides on the 6th and rotate team members on the 8th.
9th to 10th: Provide feedback on each other's arguments.
Review your partner's reasoning for and against the vulnerability labels. Point out any faulty logic, questionable assumptions, lack of evidence, etc. to improve the critiques.
Step 5, one week of judging:
We'll evaluate submissions and award prizes!
The organizers and outside experts will judge all the critiques based on accuracy, evidence, insight, and communication. Cash prizes will go to the standout critiques that demonstrate top-notch critical analysis.
Kabir Kumar
about 1 year ago
But if folks want to add more, I'd be happy to increase the prize pool. At some point, though, it might make more sense to pay the researchers who are serving as judges.
Kabir Kumar
about 1 year ago
We've already got 13+ attendees with no prize at all and I want to maximize the chances of there being a prize.
Kabir Kumar
about 1 year ago
Thank you!!
It helps that one of the consultants on our team is a highly experienced cybersecurity professional and professor.
Also, I kinda love breaking things and alignment plans are sooo vulnerable!
Kabir Kumar
about 1 year ago
Excited to say that within hours of the announcement, we already have 10 people who've joined the critique-a-thon!
Kabir Kumar
about 1 year ago
Researchers interested include:
Dr Tom Everitt of DeepMind
Dr Dan Hendrycks of xAI
Dr Roman Yampolskiy
Kabir Kumar
about 1 year ago
Update: Good news!
Kristen W Carlson, an alignment researcher at the Institute of Natural Science and Technology, said they like the site! They also said they found several papers on the site, so it seems to already be proving useful!!
A few other researchers have also expressed interest in the site!
Kabir Kumar
over 1 year ago
Thank you for your comment!
I agree, getting the site used and having good networking is very important!
On that front, there's actually quite a bit of good news! I've been reaching out to researchers for less than a week and there are already 4 alignment researchers who are very interested in the site! One has been posting his plans himself, another has asked me to post their plan for them, one has joined the team (Jonathan Ng) and another is working on a plan they're happy to have on the site when it's done!
Esben, the head of Apart Research, is also very interested in the site, and I've spoken with the creator of aisafety.careers, who wants to integrate with the site.
I also had a call with Kat Woods who said she really wanted the site to exist and seemed to think it would provide something very valuable.
It's been very promising to get a really great reception from almost every alignment researcher I've talked to about this. The two sceptics have been folks who think either that alignment is impossible or that it is basically impossible to judge a plan since we can't test it. Those are very important points, which I am looking into seriously.
| For | Date | Type | Amount |
|---|---|---|---|
| Manifund Bank | 6 months ago | withdraw | 370 |
| AI-Plans.com | 9 months ago | project donation | +370 |
| Manifund Bank | 9 months ago | withdraw | 5000 |
| AI-Plans.com | 9 months ago | project donation | +5000 |
| Manifund Bank | about 1 year ago | withdraw | 500 |
| AI-Plans.com Critique-a-Thon $500 Prize Fund Proposal | about 1 year ago | project donation | +500 |