@ms Thanks for the positive signal, Saul.
I can also relate to the positive points you mentioned. I think it is a cost-effective use of resources, and a surprising number of people seem to have found the alignment problem through the fanfiction pipeline.
Re: the right hands part,
I believe you mean people who are likely to enthusiastically enjoy the book and let the ideas in it affect them.
Even if they do not become alignment researchers, I still think it can help them discover works like Replacing Guilt and the Sequences, and apply more rational thinking in their lives. It also helps them find that this community exists out there; it feels nice to belong.
To increase the odds that these books are actually read rather than gathering dust somewhere, it helps to target places where the right people are found at higher density: places where people engage in intellectually stimulating activities and where abstract meta-thinking skills are useful.
I am currently planning to distribute 70% at events held at top universities in India, where the competitive entrance exams vet for problem-solving skills and gather smart kids in one place, and at developer events where agentic, curious, altruistic people come to talk about open-source software, AI, etc. There are other great spaces too, like meetups for quantified self, productivity, or note-taking apps.
For the remaining 30%, I am planning to go to schools and target younger people. They are unlikely to exhibit clear signs of potential yet, but I would still look for agentic behaviour by asking a class who the curious kid is, who reads widely, who is a generalist, etc.
I am excited to talk to Mikhail, and I am quite open to adjusting my plans based on new information or ideas.
Re: are these the sorts of people that we would want working on alignment?
Before the Overton window shifted so much, it made sense to rely on some gatekeeping to ensure a high signal-to-noise ratio in alignment research. But I think we now need more robust mechanisms to decide whether someone is contributing to the solution. Applying selection pressure by prematurely deciding what kind of people can work on it is likely to yield an unacceptable rate of false negatives.
My model is also that we don't have sufficient confidence in existing alignment directions to commit resources to exploiting them and selecting only for people making progress within them. Rather, we still need to fund explore-type research in the hope of making some tractable progress in time.
> are there other books/series/etc that would get more impactful people engaged?
To get people engaged more quickly, I would use Superintelligence or Brian Christian's The Alignment Problem. If enough funders are excited about that, I would be happy to use part of the funds for those books too.
If the goal is to get people into alignment quickly, HPMOR is not the way. What HPMOR does is get people interested in the rationality community and in practicing more careful thinking about risks from AI systems. It helps in a more indirect way, communicating the vibe of a certain way of thinking.
It will definitely nerd-snipe a certain kind of person, and whether HPMOR is worth funding over other books depends on what impact means: are the alignment researchers who mention HPMOR doing impactful work, or are they perhaps feeding into some sort of deference cascade?
Personally, I believe it is a net positive, so I am happy to work on this project, but I don't think the answer is cut and dried. People can hold well-informed views on which HPMOR would not exert the right kind of selection pressure.
Re: does HPMOR also unnecessarily alienate people from alignment?
I think this links to the idea of HPMOR being given as a way to push people into alignment. The downside risk of alienating people comes from their feeling manipulated into a certain frame.
This is why I think it is useful to frame HPMOR as a tool for raising the sanity waterline or increasing awareness of certain ways of thinking. That might lead readers to become curious about the generator of the work, read other works, think more clearly about AI safety, and, as a side effect, want to contribute to the field. But that would come from within, and I don't think it can or should be forced.
It is also true that this pipeline will lead people to discover alignment through the MIRI worldview first, and that might bias them. But there is plenty of dissent and other opinion in this space, so even if they find the problem this way, they can still engage with and criticize that framing.
But that risk of alienation arises only if people read this piece of fan fiction and enjoy it enough to go down the rabbit hole of EY's other work. By itself, I do not think the book can piss anyone off.
Even people who dislike or disagree with EY can still enjoy this book and recommend that others read it. I believe the art stands separate from the artist, at least in this case.