Chris Leong
2 days ago
@Greg_Colbourn It might be possible to delay AGI for a short while, but I honestly don't think we'll be able to delay it for that long. And even if we get a delay, there's still the question of what to do with the delay.
Chris Leong
2 days ago
Congrats on reaching 134k subscribers; that's a major achievement!
For what it's worth, I wish the podcast would lean slightly more towards maintaining high-quality epistemics. Unfortunately, AI safety is a very complex issue, and it's really not straightforward what needs to be done. We need people not just to get concerned, but also to have as accurate a picture of our situation as possible.
I think AI Frontiers mostly has the right idea in terms of who they've chosen to target:
"Imagine you’re writing to an undergraduate roommate who’s studying in a different field. Assume your audience is intelligent, but do not overestimate the time they can give you, or the prior knowledge they bring. Avoid jargon to increase accessibility for a broad audience. Whenever possible, use clear, concise language or examples to explain concepts in plain language, and favor active voice over passive constructions."
Chris Leong
2 days ago
I don't have a deep knowledge of evals, but I agree with others that this proposal seems really good, at least in theory.
One point of difference I have with Laurence is that I think this method could be useful even if adoption for frontier model evals is limited. Empirical confirmation that the method works well in practice would be valuable in and of itself, as it would establish the Bayesian perspective as a useful conceptual frame. Similarly, if the framework proves applicable, it could improve our scientific understanding of model capabilities.
Chris Leong
2 days ago
This is a fascinating project.
One suggestion: I don't think we can assume that an AI system will be able to perfectly adapt principles to local contexts, so there needs to be some mechanism for feedback to flow back up the system.
Chris Leong
2 days ago
I was highly impressed with how well Manifest was run. I see this as evidence that their team would be suitable to run a project like Mox as well.
Outreach in San Francisco/the broader Bay Area seems highly underrated by the AI safety community. It seems much more tractable to attempt to shift attitudes in the Bay Area than the conversation in the US as a whole or the conversation globally, yet attitudes in the Bay Area still seem likely to have a significant impact on how AI goes.
Chris Leong
2 days ago
Three former employees of Epoch just split off to launch Mechanize, a project that seems like it could be acceleratory. I think it would be useful for Epoch to provide an update on whether they plan to take any action to reduce the chance of something similar happening in the future (or, alternatively, whether they think it wouldn't make sense for them to try to prevent similar occurrences).
Chris Leong
2 days ago
I strongly agree with this comment that Ryan Kidd left on TARA and I think it applies to this program as well:
"As with all training programs that empower ML engineers and researchers, there is some concern that alumni will work on AI capabilities rather than safety. Therefore, it’s important to select for value alignment in applicants, as well as technical skill."
Have you given much thought to this? (You should probably think carefully about what you want to say publicly as providing too much information may make it easier for folks to hack any attempt to assess them).
My concern isn't just about alumni working on AI capabilities; it's that many people would absolutely love a free ML bootcamp, and AI safety is still a relatively niche interest. So having some kind of filtering mechanism seems important to prevent the impact from being diluted.
I guess it would be possible to try to convince folks about the importance of safety during a bootcamp, but I think it'd be challenging. The Arena curriculum is quite intensive, which makes it hard to squeeze in time for people to deeply reflect on their worldview. Also, I'm just generally in favour of programs that do one thing well, since adding more goals makes it harder to hit each individual one out of the park.
If you think there aren't enough folks locally who are interested in AI safety/interpretability, you may want to consider running a variant of Condor Camp or ML4Good instead. I don't know what exactly is in their curriculum, but my impression is that these programs might be more suitable if you're aiming for a mix of technical upskilling and outreach.
Chris Leong
5 months ago
I'll vouch for the quality of the AI Safety Events & Training newsletter.
I guess the main point I'd like clarity on is their plan for increasing distribution of this newsletter.
Chris Leong
7 months ago
You may want to consider applying to the Co-operative AI Foundation for funding in the future. I don't know if they would go for it, since they seem to have a more academic focus, but there's a chance they would.
Chris Leong
about 1 year ago
This is a cool project that might help improve the conversation around these issues.
Some people might be worried about hype, but there's already so much hype that the marginal harm is likely small.
You may want to consider linking people to an AI safety resource if you think your site may get a lot of traffic. Then again, you might not want to if you think that would make people more suspicious of the results.
Another option to consider would be an ad-supported model. I'm not suggesting Google AdWords, but you might be able to find an AI company to sponsor you.
Chris Leong
about 1 year ago
@casebash I should state my reasoning as it may encourage others to invest.
A $2000 minimum is quite a reasonable bet given your background and the quality of the video provided.
Video content is one of EA's weaknesses. I also imagine this work could receive further funding if the first video or videos were done well, which would increase its impact.
One thing that would increase my optimism about this project would be a plan to get people from watching these videos to potentially taking action.
Chris Leong
about 1 year ago
@alexkhurgin I offer to purchase an impact certificate at the default price. Open to negotiating. I mostly selected the default because I’m new to this funding mechanism and I’m still a bit confused by it.
Chris Leong
about 1 year ago
This is actually a really cool idea which might help people form estimates and convince more people to think about these risks. One worry I always have with projects like this is maintenance, and how much continual updating it would require.
Chris Leong
over 1 year ago
Thanks so much for your support!
Oh, is the minimum locked once you create a post? I was tempted to move the minimum down to $700 and the ask down to $2000, but then again I can understand why you wouldn't want people to edit it after someone has made an offer as that is ripe for abuse.
In terms of why I'd adjust it: I'm trying to figure out what would actually motivate me to produce more of this content, rather than just putting a bit of extra money in my pocket without any additional content production. I figure that if there's a 20% chance of a post being a hit, I'd need at least funding for a week* in order for it to be worthwhile for me to spend a full day writing up a post (as opposed to the half-day that this post took me).
In terms of the $2000 upper ask limit, I'm thinking it through as follows: if someone were able to write ten high-quality alignment posts in a year (quite beyond me at the moment, but not an inconceivable goal), that would work out at $20k, and it might be reasonable for writing such posts to be a third of their income.
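To make that back-of-the-envelope arithmetic explicit (the 20% hit rate, the $2000 ask, and the ten-posts-a-year scenario are as above; treating "funding for a week" as roughly five working days of costs is an approximation for illustration, not an exact figure):

```python
# Rough sketch of the funding arithmetic above. The 20% hit rate, $2000 ask,
# and ten-posts-per-year scenario are from the comment; treating a week of
# funding as ~5 working days of costs is an illustrative approximation.

p_hit = 0.2               # chance that a full-day post turns out to be a "hit"
days_funded_per_hit = 5   # a hit pays roughly a week (~5 working days) of costs

# Expected days of funding earned per full-day post:
expected_days_per_post = p_hit * days_funded_per_hit  # 0.2 * 5 = 1.0
# => on average each full-day post earns about one day of funding,
#    which is what makes spending the full day on it worthwhile.

ask_per_hit = 2000        # proposed upper ask per hit, in dollars
posts_per_year = 10       # hypothetical productive year of hit posts
annual_funding = posts_per_year * ask_per_hit  # 10 * 2000 = $20,000
implied_income = annual_funding * 3            # if that's a third of income: ~$60,000

print(expected_days_per_post, annual_funding, implied_income)
```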
(PS. I decided to do a quick browse of highly upvoted posts on the alignment forum. It seems that quite a high proportion of highly upvoted posts are produced by people who are already established researchers/phd students, such that if there was a funding scheme for hits** and that scheme was aiming to avoid double funding people, the cost would be less than it might seem).
Anyway, would be great if I could edit the ask, but no worries if you would like it to remain the same.
* My current burn rate is less b/c I'm trying really hard to save money, but this is a rough estimate of what my natural burn rate would be.
** Couldn't be based primarily on upvotes, because that would invite vote manipulation and push people towards writing content optimised for upvotes.
Chris Leong
over 1 year ago
Funnily enough, I was going to reduce my ask here but hadn't gotten around to it yet, so now it may look like it's in response to this comment when I was going to do it anyway.
Chris Leong
over 1 year ago
You should probably write about who you are and how your participation would benefit AI safety.
Chris Leong
over 1 year ago
Hey Felipe, I'm currently doing community building at AI Safety Australia and New Zealand, and I'm quite interested in decision theory (currently doing an adversarial collaboration on evidential decision theory with Abram Demski, a MIRI researcher). Would be keen to hear if you end up in Australia.
Chris Leong
over 1 year ago
I would be really excited to see the establishment of an AI safety lab at Oxford, as this would help build the credibility of the field, and lack of credibility is one of the core problems holding alignment research back.
That said, I suspect that choosing the right research direction is crucial when establishing a new lab, as it's important to lead people down promising paths. I haven't evaluated their proposed directions in detail, so I would encourage anyone considering donating large amounts of money to do so themselves.
Disclaimer: Fazl and I were discussing collaborating on movement building in the past.
| For | Date | Type | Amount |
|---|---|---|---|
| Run a public online Turing Test with a variety of models and prompts | 11 months ago | user to user trade | 250 |
| Educate the public about high impact causes | about 1 year ago | user to user trade | 224 |
| Manifund Bank | about 1 year ago | deposit | +500 |