(Note: This is a new 501c3 I will be running, rather than being fiscally sponsored by the Center for Applied Rationality. Many of Lightcone’s staff will be moving with me, and others will contract with us from CFAR at least as we get started.)
I think the best predictor for the work that will happen at the new 501c3 is the work I accomplished with my team at CFAR.
Over the past 7-ish years we’ve provided a large fraction of the infrastructure available for people explicitly trying to prevent existential risk and people working towards long-term flourishing for humanity.
I’ll elaborate on the following below, but, in brief, during that time we:
Revived LessWrong, started the AI Alignment Forum, and grew them to many thousands of active researchers and writers. We also open sourced the software for the forum, which is now used by the EA Forum and the Progress Forum (and Sam Harris’s “Waking Up” forum).
Created some of the most popular formats for AI x-risk retreats and conferences, which were adopted by the EA Coordination Forum, the Existential Security Summit, and many other meetings in the EA/Rationalist/x-risk space
Founded MLAB (Machine Learning for Alignment Bootcamp) together with Redwood Research, and ran the Icecone fellowship for students wanting to work on AI x-risk
Created a popular format for in-person office spaces that heavily influenced Constellation and FAR Labs
Built the Lighthaven campus, a renovated 35,000 sq. ft. hotel property where we host conferences, researchers and training programs, which after <1 year of operations is already covering 70% of its costs on ~45% utilization, and has had a large positive impact on programs like MATS, Manifest and other in-person meetings/fellowships
Built the S-Process (together with SFF), which distributed over $80M of donations over the last 5 years. We also more directly ran Lightspeed Grants, which distributed ~$8M
Did a whole lot of miscellaneous work filling in various gaps in AI x-risk community infrastructure and governance, including investigating what happened with FTX and other potential future malefactors, helping lots of projects get started, helping people navigate the funding landscape, providing fiscal sponsorship for a bunch of young projects, helping run events, and participating in a lot of important conversations and discussions in various ways
LessWrong.com was originally founded by Eliezer Yudkowsky in 2009; however, he and others at MIRI stopped maintaining it around 2012 (with a gradual decline in software development and moderation efforts until 2015).
By 2016 the site was basically dead, as a result of outdated and inadequate forum software and a lack of dedicated moderation staff, with ~25% of the activity it had at its peak in 2015. In 2017, I (Oliver Habryka) decided to rebuild the LessWrong forum software from scratch, and we launched the new site in late 2017.
I then recruited a team which over the past 7 years built the software behind LessWrong into what seems to me by far the best forum software for intellectual progress on the internet. Using that software, LessWrong has grown around 3-9x on most activity metrics since 2018.
Here are some screenshots from our analytics dashboard:
In 2018 we also started the AI Alignment Forum, a subset of LessWrong focused on discussion and research about AI existential risk and the technical problems around AI alignment. Activity on the AI Alignment Forum is included in the above graphs.
Pure user activity is of course not sufficient to establish that we have created much value. To do that, we need to look at the actual content that was produced and the impact it had.
Overall, I think it is accurate to summarize that (especially before 2022) the vast majority of public thinking on AI x-risk has so far taken place on LessWrong and the AI Alignment Forum. I think basically everyone who is currently working on AI x-risk has had their strategy majorly influenced by our work on LessWrong and the AI Alignment Forum, and I think this influence has overall been substantially for the better.
LessWrong, despite often not being directly cited, has had a large impact on policy conversations. It has very frequently served as the generator of important concepts and memes which then spread far outside the social network classically associated with LessWrong. Some concrete cases:
Dominic Cummings talking about the impact of LW-posts on UK government COVID response
Matt Clifford, CEO of Entrepreneur First and Chair of the UK’s ARIA recently said on a podcast [emphasis mine]:
"Jordan Schneider: What was most surprising to you in your interactions during the build-up to the summit, as well as over the course of the week?
Matt Clifford: When we were in China, we tried to reflect in the invite list a range of voices, albeit with some obvious limitations. This included government, but also companies and academics.
But one thing I was really struck by was that the taxonomy of risks people wanted to talk about was extremely similar to the taxonomy of risks that you would see in a LessWrong post or an EA Forum post.
I don't know enough about the history of that discourse to know how much of that is causal. It's interesting that when we went to the Beijing Academy of AI and got their presentation on how they think about AI risk safety governance, they were talking about autonomous replication and augmentation. They were talking about CBRN and all the same sort of terms. It strikes me there has been quite a lot of track II dialogue on AI safety, both formal and informal, and one of the surprises was that we were actually starting with a very similar framework for talking about these things."
Patrick Collison talking on the Dwarkesh podcast about Gwern’s comments on LW and his website:
"How are you thinking about AI these days?
Everyone has to be highly perplexed, in the sense that the verdict that one might have given at the beginning of 2023, 2021, back, say, the last eight years — we're recording this pretty close to the beginning of 2024 — would have looked pretty different.
Maybe Gwern might have scored the best from 2019 or something onwards, but broadly speaking, it's been pretty difficult to forecast."
When the OpenAI board fired Sam Altman, Gwern’s comments on LessWrong were very widely linked as the best explanation of the dynamics that were going on behind the firing.
Lina Khan (head of the FTC) answering a question about her “p(doom)”, a concept which originated in LessWrong comments: https://twitter.com/liron/status/1723458202090774949
Since 2019 we have run an annual review in which we identify the best content written on LessWrong and the AI Alignment Forum, the results of which can be found here: https://www.lesswrong.com/leastwrong?sort=year
Some of the posts which seem best to me, with some relevant quotes:
Paul's research agenda FAQ by Alex Zhu
Reading Alex Zhu's Paul agenda FAQ was the first time I felt like I understood Paul's agenda in its entirety as opposed to only understanding individual bits and pieces. I think this FAQ was a major contributing factor in me eventually coming to work on Paul's agenda.
— Evan Hubinger (now working at Anthropic leading the Alignment Stress-Testing team)
Embedded Agency by Scott Garrabrant and Abram Demski
[After reading this post] I actually have some understanding of what MIRI's Agent Foundations work is about.
— Rohin Shah (working on the DeepMind AI Safety team)
This post (and the rest of the sequence) was the first time I had ever read something about AI alignment and thought that it was actually asking the right questions.
— John Wentworth (alignment researcher leading his own team)
Risks from Learned Optimization by Evan Hubinger
This sequence (and the associated paper) was the first proper writeup on inner alignment, a problem that is load-bearing for many alignment researchers’ concerns about AI risk.
...it brought a lot more prominence to the inner alignment problem by making an argument for it in a lot more detail than had been done before… the conversation happening at all is a vast improvement over the previous situation of relative (public) silence on the problem.
— Rohin Shah
AGI Ruin: A List of Lethalities by Eliezer Yudkowsky
This post is IMO the best summary of the core difficulties of the AI Alignment problem in one place. While I don’t have a neat quote to attest to its impact, it is one of the most frequently linked articles in AI Safety and had a very large impact on the field (and is the highest karma post on LW).
Where I agree and disagree with Eliezer by Paul Christiano
A response by Paul Christiano (head of the Alignment Research Center) to the above article by Eliezer. The two frequently get cited together (and this is the second-highest karma post on LW).
Let’s think about slowing down AI by Katja Grace
This post preceded (and maybe precipitated) a very large shift in the strategy of many people working on AI existential risk, in what seems to me to be a positive direction. Since this post came out, many of the people I most respect in the field have shifted their primary strategy to one of trying to delay the development of existentially dangerous AI systems, and many of them seem to have been influenced by this article.
Of course LessWrong is not only about existential risk from AI. Some good posts written on non-AI topics include:
Strong Evidence is Common by Mark Xu
I really like this post. It's a crisp, useful insight, made via a memorable concrete example (plus a few others), in a very efficient way. And it has stayed with me.
— Joe Carlsmith (Open Philanthropy researcher on the worldview investigation team)
How factories were made safe by Jason Crawford
The go-to suggestions for pretty much any structural ill in the world today are to "raise awareness" and "appoint someone". These two things often make the problem worse. "Raising awareness" mostly acts to give activists moral license to do nothing practical about the problem, and can even backfire by making the problem a political issue. [...]
So what was different with factory safety? This post does a good job highlighting the two main points:
The problem was actually solvable
The people who could actually solve it were given a direct financial incentive to solve it
This is a good model to keep in mind both for optimistic activists who believe in top down reforms, and for cynical economists and public choice theorists. Now how can we apply it to AI safety?
— Jacob Falkovich (long-time blogger and author of “Seeing the Smoke”)
My computational framework for the brain by Steven Byrnes
An IMO very impressive attempt at creating a complete model of how the brain performs high-level cognition, written by Steven Byrnes, staff researcher at Astera.
Simulacra levels and their interactions by Zvi Mowshowitz
I found the concept of “Simulacra levels of conversations” to be a really crucial tool for understanding how to think about large-scale coordination, including things like an AI pause/summer. A quick summary from Zvi himself:
Level 1: There’s a lion across the river.
Level 2: I don’t want to go (or have other people go) across the river.
Level 3: I’m with the popular kids who are too cool to go across the river.
Level 4: A firm stance against trans-river expansionism focus-grouped well with undecided voters in my constituency.
These posts are of course only a very small fraction of the high-quality posts on LessWrong. If readers have any doubt about the volume and quality of thinking on LessWrong, I encourage them to check out the Best of LessWrong page, as well as any tag pages on topics whose quality they feel comfortable judging. Some tag pages of this type: Iterated Amplification, Forecasting & Prediction, Coordination / Cooperation, Anthropics, Efficient Market Hypothesis, Geometric Rationality, and many, many more.
In 2021 we ran the Sanity & Survival Summit, a very highly rated retreat for a lot of what I consider the best thinkers in the AI x-risk space. It was oriented centrally around memos instead of presentations or talks. We followed this up with Palmcone in 2022, which followed a similar structure. This structure was subsequently copied and adopted for a large number of EA and AI-x-risk-adjacent events, including the last three EA Coordination Forum events and the last two Existential Security Summit events organized by CEA.
In 2021 we founded the Lightcone Offices, the first large-scale office space for organizations working in the x-risk and EA space. We had a dedicated ops team that provided broad operations services to the individuals and organizations we hosted, and an application system for those wishing to be admitted to the space, and we actively ran fellowship and research programs that made use of the confluence of people there. Constellation adopted a lot of our approach and systems and has grown to be the largest space of its kind: they run it as a community platform with temporary programs letting visitors interact with the full-time members, members have individual Slack channels for making requests, they have used (with permission!) some of our event descriptions, we designed a lot of their office spaces for them in the first year, and they make use of access management software we wrote (and we own ~1% of their impact equity via a complicated related agreement). FAR Labs also adopted a very similar model after we shut down.
I am bringing up these two instances less to claim that they specifically were enormously impactful, and more to illustrate that we’ve done a lot of active invention across a very wide range of infrastructure projects. We’ve generally taken a strong “do whatever needs doing” attitude towards infrastructure, often working on projects without credit and being generous with our time and effort towards other collaborating organizations in the space. I think a lot of projects in the AI x-risk infrastructure space owe a good chunk of their success to the groundwork we laid.
One key difficulty we ran into while managing Lightcone Offices was having people with different levels of context and background in the same space. A major component of the value proposition behind the Lightcone Offices was to help people new to thinking about AI x-risk orient, find mentors, and try their hand at alignment research. This involved organizing a good number of fellowships and events.
However, structuring the office such that it would facilitate interactions between new and established people working on existential risk was quite hard. Established people would report a sense of being overwhelmed with new people when we ran programs, and we also saw a bunch of young people putting established people up on a pedestal, which created unhealthy status dynamics.
One key advantage of the Lighthaven property (and this is the only venue in Berkeley where we found this to be true) is that it naturally allows us to run events and fellowships in separate buildings from the rest of the campus, while still allowing a healthy level of interaction between people in different areas and not disrupting the workflows of people who are more established. Separate buildings with a shared courtyard seem to me a lot better for enabling this kind of healthy separation than multiple floors of an office building (which I think would mostly prevent people from running into each other) or a single large office floor with a shared kitchen and common area.
Also, I think the venue is just a really great working and thinking environment. It gets as close as I think is possible within the Bay Area to the intersection of “a secluded, slightly paradisiacal oasis” and “close to where all the people are and easy to access”. This definitely has a large effect on the quality of the Lightcone team’s work here, and it’s the most common reason people cite for wanting to work from Lighthaven and run events here.
There are also various other advantages to owning a property and developing it ourselves, compared to renting it. Some discussion of that here.
Since we finished construction, we have hosted:
Three cohorts of the SERI MATS program (100+ scholars)
John Wentworth (and his collaborator David Lorell), Adam Scholl, Aysja Johnson, Genesmith (founder of Bootstrap Bio), and kman (cofounder of Bootstrap Bio) as permanent resident researchers
The Manifest 2023 conference (where our venue received a 9.2/10 average rating)
and many dinners, parties and team retreats.
From these services, we’ve made ~$1.1M in revenue over the last nine months, and my guess is that within a year or two the campus will mostly fund itself on the basis of such service fees (and potentially run a surplus that can fund our other programs, though that is definitely more uncertain).
Since 2017, we have helped organize over 5,000 meetups via the LessWrong and SSC meetup system we built. We have organized over 50 events ourselves with ~6,000 person-hours spent at those events, and have generally provided extensive technological and operational support to local EA communities and meetups around the world.
During the pandemic we pivoted to focus on online events, and built an online venue and permanent social hangout space called The Walled Garden, which was used by many external groups to organize their own events (~30 local groups + people at CHAI, MIRI and AI Impacts). Our tracking says over 4,000 user-hours have been spent in the Garden, and it is still occasionally used for things like remote SERI MATS workshops and various ACX and LW meetups.
Also, here are some other projects we’ve executed in this domain:
We built and provided a bunch of infrastructure for the 1,000+ meetups of the annual SSC/ACX Everywhere meetup event.
We helped organize the 2020, 2021, 2022 and 2023 Secular Solstice celebrations
Each year, we help local communities run Petrov Day celebrations. Jim Babcock devised a Petrov Day ritual (at petrovday.com) prior to joining LessWrong, and updates it as part of his work at LessWrong.
While running the Lightcone Offices, we also co-founded the Machine Learning for Alignment Bootcamp with Redwood Research and ran the 2022-2023 Icecone program.
In December 2021 we organized the Icecone Visiting Fellowship, which involved inviting ~60 of the most promising undergraduates and graduate students we could find through our student group connections, and having them spend two weeks in Berkeley learning about AI alignment and rationality and meeting people working on existential risk reduction.
My sense is that many of the attendees found it extremely valuable. Many of those I talked to said it had a big effect on them, and a number of them went on to work at well-established organizations in the x-risk space, including the UK AI Safety Institute, CEA, SERI MATS, CAIS, Palisade Research and Open Philanthropy.
Overall feedback was also highly positive:
I have less direct analytics for MLAB, though my sense is that it was similarly impactful, albeit with more indirect capabilities externalities (I know of a bunch of people who, as a result of the bootcamp, ended up working in capability roles, or in what seem to me to be safetywashing roles, at major AI capability companies). MLAB also sparked a bunch of spinoff programs using its format and curriculum, including WMLB and ARENA.
Since early 2021, Lightcone has been operating under an “end-to-end ownership” model of community building, where we started taking responsibility for the success of the overall AI Alignment/Rationality/EA/Longtermist community, and proactively looking for work that seemed important but that nobody seemed to be working on.
This has caused us to pick up a lot of relatively unglamorous and stressful work oriented around “trying to dig into abusive community dynamics” and “sharing information about bad actors”.
Some of this work is by its nature confidential, which makes it hard to go into details here. I do think in expectation this has been quite important work, and here are some parts I can easily share:
Ben Pace and I spent 200+ hours investigating the collapse of Leverage Research in 2021. This involved 15+ interviews with past Leverage employees and people close to Leverage, a bunch of longer conversations with Geoff, and (with my personal funds) putting out a prize for information relevant to the collapse.
I’ve shared the results of this investigation privately with some people who considered getting more involved with Leverage. We still have a draft of a more public post that we might publish if it ever looks like Leverage Research will end up in a more influential position again, but currently things seem to be declining in a way that makes it not clearly worth it to invest an additional 100 hours or so to get the post ready to publish.
We conducted an investigation into Nonlinear based on multiple reports by past staff of abusive dynamics and bad experiences. Unlike Leverage, Nonlinear is increasing in prominence in the x-risk community. You can see the results of the investigation here: https://www.lesswrong.com/posts/Lc8r4tZ2L5txxokZ8/sharing-information-about-nonlinear-1
I facilitated a series of dinners and did 15+ interviews with people about the collapse of FTX and the role that the EA/Rationality/AI-Alignment/x-risk ecosystem played in that collapse. I’ve shared some of my findings in comments on the EA Forum early in 2023, and have shared other things more privately with people who are doing their own investigations.
Sadly, due to confidentiality requests it is kind of hard for me to write up a lot of my findings publicly, though I have contributed them whenever I’ve seen people having relevant conversations in public.
Our budget for the next year is really just operating expenses for our core programs.
Roughly 60% of our forward operating budget is allocated to LessWrong and related online infrastructure, 30% to Lighthaven's ongoing operations and maintenance (offset by ~$1.6M in expected revenue), and 10% to management overhead and legal/financial administration, for a total budget of around $2.7M.
Breaking this down:
$1.4M – LessWrong & online infrastructure
$0.9M – Lighthaven
$0.4M – Overhead, legal & misc
$2.7M – Total
Breaking it down by expenditure type:
$1.3M – LessWrong core staff
$0.1M – LessWrong hosting & software subscriptions
$0.3M – Lighthaven core staff
$1.1M – Lighthaven maintenance contractors & supplies
$1.1M – Interest & projected property taxes
$0.4M – Overhead, legal & misc
-$1.6M – Lighthaven projected revenue
$2.7M – Total
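For readers who want to double-check the arithmetic, here is a minimal sanity-check sketch in Python (using only the rounded figures listed above, so it is approximate by construction); both breakdowns tally to the stated ~$2.7M total:

    # Rounded budget figures from the two breakdowns above (amounts in $M).
    by_program = {
        "LessWrong & online infrastructure": 1.4,
        "Lighthaven (net of projected revenue)": 0.9,
        "Overhead, legal & misc": 0.4,
    }
    by_expenditure_type = {
        "LessWrong core staff": 1.3,
        "LessWrong hosting & software subscriptions": 0.1,
        "Lighthaven core staff": 0.3,
        "Lighthaven maintenance contractors & supplies": 1.1,
        "Interest & projected property taxes": 1.1,
        "Overhead, legal & misc": 0.4,
        "Lighthaven projected revenue (offset)": -1.6,
    }
    print(f"By program:          ${sum(by_program.values()):.1f}M")          # $2.7M
    print(f"By expenditure type: ${sum(by_expenditure_type.values()):.1f}M") # $2.7M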
Our basic plan for impact is to reduce risk from artificial intelligence, mostly by building infrastructure for people who do good work in that space, and by iterating on the culture and methodologies of the community we think is likely to make progress on reducing existential risk.
I am not committed to any particular detailed gameplan. That said, here is something like my current best concrete plan.
Build the world’s best online research platform to shape the incentives and facilitate effective information exchange between people working on existential risk
Build and run a best-in-class campus in Berkeley, purpose-built for events with small group conversations and uninterrupted individual reflection
(Maybe found a research institution in something like the intellectual tradition of the recently shuttered Future of Humanity Institute)
Of the projects Lightcone works on, you are probably most familiar with LessWrong. In 2017, LessWrong was basically dead, and I think the default path without Lightcone’s work on it would have been traffic and activity going to approximately zero. About 4 months after I started working on LessWrong full-time, Scott Alexander wrote a comment in August 2017 predicting LessWrong wouldn’t be resurrected. Nate Soares also bet against me on whether LessWrong 2.0 would work. He conceded the bet in 2021 (4 years later), and made a public note in 2022.
The funding we’re asking for is both to keep LessWrong going, and for us to work on more infrastructure that’s as valuable to the x-risk ecosystem as LessWrong. Part of that will be built on top of our campus in Berkeley, but it will also involve proactively looking for what is needed, which might be a new project or might involve transforming LessWrong in a big way, or scaling it up by another 3-9x. It took a while for it to be clear that LessWrong 2.0 had worked. (Nate writes in the note above that the results looked “pretty ambiguous” for a year or two). New projects might also take a while, and part of this grant is paying for the search process for “the next LessWrong”.
Ultimately I am planning to build whatever infrastructure will most help people who are successfully reducing existential risk and are creating a flourishing future for humanity.
Expanding a bit more on each component of our plan:
LessWrong.com and the AI Alignment Forum are the central ways we are pursuing the first goal. LessWrong and the AIAF mostly have “product-market fit” in the startup sense of that term, and have experienced substantial and consistent growth over many years, which at least tells us that we are likely having some kind of relatively large impact (though it is somewhat less clear how positive that impact is).
At a daily level, this means we make improvements to the software, build relationships with authors and assist them in ad-hoc ways, and talk to readers and commenters to understand what problems they have with the product, and then fix those problems. In that sense development on LessWrong and the AI Alignment Forum is pretty similar to development for a software startup.
We also take a bigger-picture perspective on the overall health of the community we are building on LessWrong, and try to predict long-term trends that could substantially harm intellectual progress on the site. Most recently this has meant grappling with the overwhelming volume of AI content, and with people who are joining primarily out of an interest in AI (not even necessarily AI safety), which is causing large cultural shifts. Some of these shifts are good and some are bad, but either way we have to adjust the site infrastructure for them, and they have caused us to raise our moderation standards a lot. It would otherwise seem very easy for the forum to fall prey to the “Eternal September” effect when we experience this much growth in a short period of time.
We are pursuing the second goal through running and managing the Lighthaven campus. During our work on LessWrong, we repeatedly found cases of intellectual progress bottlenecked not on better features or a better online community, but on high-bandwidth communication that is very hard to achieve without being in person.
I’ve attended hundreds (and run dozens) of events in the x-risk community, and venue quality seems to me to substantially affect how well they go. Some events are best run in office spaces. But other events involve people sleeping on-site or benefit from lots of dynamically forming small group discussions. For the latter kind of event, the CFAR Venue in Bodega Bay and the SSS Ranch (which we found for the SSS retreat) are the best non-Lighthaven Bay Area venues that I know of.
These are both about a 90-minute drive from downtown Berkeley and quite remote, which is a defeater for many events or attendees. It’s also easier for participants to find each other at Lighthaven than at SSS, and you can host larger events at Lighthaven than at the CFAR venue.
I am considering founding an “FHI of the West”. This would be a direct research arm of Lightcone where we hire researchers and run fellowships pursuing open-ended research on important considerations for humanity’s future. This seems particularly valuable given the recent shuttering of FHI, and my sense is Lightcone is quite well-placed to run something like this. See this LW post for some related discussion.
Bostrom has expressed interest in being part of this (though he hasn’t formally committed to anything), as have Daniel Kokotajlo, John Wentworth, Abram Demski, and many others who filled out our “FHI of the West” interest form. I think we could probably create the world’s best place for that kind of work, especially if we leverage both LessWrong and Lighthaven to create a great research and publishing environment (to be clear, I would want many publications to be distributed more broadly than just on LW, though I expect LW to play a key role in the intellectual environment of such an org).
This project is definitely in its early stages, and I am not making any promises we will work on this, though I do think it’s among the top candidates for new initiatives we might want to start. It also seems useful to give a sense of the kind of project we are considering for the future, even if our plans for those projects are still in flux.
Over the past ~4 months, we have been incubating an AI-for-coordination project called Chord. It currently consists of one full-time employee. Its long-term aim is to build technologies that enable people to coordinate on important goals like not building AI or agreeing on acceptable and effective regulations.
Chord is trying to solve modest problems now and then use user feedback to move towards a product that can help with those more ambitious goals. Work so far has mostly consisted of building a coordination workspace / chat app that a group of people can use to decide on some joint activity (like which film to watch, how to structure a team meeting, or which new tool to buy for a shared workshop).
I think it helps to understand how Lightcone runs when thinking about funding us.
One of the most useful facts to understand about Lightcone is that we are, as far as I can tell, the slowest-growing EA-adjacent team. There are organizations and teams that have seen less net growth over the last 7 years, but I can’t think of an organization that has added as few people to its roster (counting both hires who were later fired and those who still work there now). I’ve consistently hired ~1 person per year to our core team for the six years Lightcone has existed (resulting in a total team size of 7 core team members).
This is the result of the organization being quite deeply committed to changing strategies when we see the underlying territory shift. Having a smaller team, and having long-lasting relationships, makes it much easier for us to pivot, and allows important strategic and conceptual updates to propagate through the organization more easily.
Another result of the same commitment is that we basically don’t specialize in narrow roles, but instead aim to have a team of generalists where, if possible, everyone can take on almost any other role in the organization. This enables us to shift resources between different parts of Lightcone depending on which part of the organization is under the most stress, and to feel comfortable considering major pivots that would involve doing a very different kind of work, without requiring major staff changes every time.
Another procedural commitment of the organization is that we try to automate as much of our work as possible: we aim to use software wherever we can to keep our total staff count low, and to create processes that handle commitments and maintain systems, rather than having individuals perform routine tasks on an ongoing basis (or, at the very least, we try our best to augment the people doing routine tasks with software and custom tools).