Johannes C. Mayer

@johannesCmayer

AGI notkilleveryoneism

https://www.lesswrong.com/users/johannes-c-mayer
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

See https://www.lesswrong.com/users/johannes-c-mayer

Outgoing donations

Lightcone Infrastructure
$200
over 1 year ago

Comments

Ambitious AI Alignment Seminar

Johannes C. Mayer

5 days ago

Having read most of the proposal, it seems like you don't have a model of how to make people good at alignment research.

You write things like "deep technical engagement". I expect you mean studying existing literature.

I expect this won't work, or at least not work reliably. Part of what makes the alignment problem hard is that we don't yet have a good model of the problem. To make progress on alignment you need the ability to notice your confusion, to think it through, and to not give up until you have achieved some level of clarity.

You need to be able to handle situations where there is no obvious next step. You need to know how to pick up the problem and look at it from different angles until, after significant analytical effort, you actually manage to make progress, instead of dropping the ball early.

When I have tried to teach people how to do alignment research, the main problem I run into is that I don't manage to get them to seriously try. That is, to seriously try to solve the actually hard problems.

Either they get distracted by some cool, tractable, but ultimately inconsequential problem, or they run off to read the Sequences, read all of Vanessa's work, study math, etc.

Of course I'm not saying that reading the Sequences, learning math, or reading other people's work is intrinsically bad. It's bad here because it's used as an escape mechanism. It's easier to study linear algebra than to try to make progress on alignment. Learning linear algebra might be hard, but at least the path is clear.

There are probably many more important skills I didn't list that are necessary to become an effective alignment researcher.

The problem: your proposal doesn't even try to point at this set of skills. I expect you're not going to even try to teach this skill set, because you can't, because you don't have a model of what these skills even are or how to teach them.

Now, all that said, all else equal, I expect this project is good to do. It is just that I would be much more excited about it if you could lay out a clear model of what kind of mental procedures are required to make progress on alignment, and how to train these procedures effectively.

This is a hard problem, and I wish more people would seriously think about it, especially the people who run events like this, meaning any events with the goal of turning people into capable AI alignment researchers.

The only person I know who seriously thought about this and then tried to implement his model, on a semi-large scale, is John Wentworth. I think he got a lot right in his MATS program stream, but it also feels like he only tried to teach a fraction of the skills necessary.

Transactions

For                        Date              Type              Amount
Lightcone Infrastructure   over 1 year ago   project donation  $200
Manifund Bank              over 1 year ago   deposit           +$200