Asteroids, Earthquakes, the E-M Spectrum, and Y2K
In which we try on analogies and ways to think about AGI
I have so much more to say about Tesla and Waymo but since little has actually changed since last week, let’s take a break and talk about AGI analogies. I’ve been adding comments to last week’s Turkla post if you’re curious. At this point I’m pretty much 50/50 on whether Tesla is cheating.
Ok, analogy time!
1. AGI Is Like an Incoming Asteroid with Primitive Astronomers
Imagine an asteroid headed towards Earth. Also imagine that the field of astronomy is so much in its infancy that we’ve just barely reached consensus on heliocentrism. How big the asteroid is, when it will hit, whether it will hit, what damage it might do if it does hit — all of these are wide open questions. We can’t even agree on whether asteroids necessarily burn up harmlessly in the atmosphere.
In that world, even if the scientific consensus were that direct impact was unlikely, it’d be hard to worry about anything else until the question was settled. Or at least until the probability of catastrophic impact was pushed below, say, 1%.
This is why AGI pooh-poohers can simultaneously be probably correct yet crazy reckless. It’s fine to have a strong intuition that AGI will turn out to be safe or that it’s a long way off, but if your strongest arguments are compelling historical analogies or shortcomings of current models, that’s emphatically not enough, given the stakes.
2. AGI Is Like the Cascadia Earthquake
I live in Oregon where we’re due for a devastating earthquake. Any decade now it’s going to happen. It’s hard to feel visceral worry about it when the probability of it happening this year is very low. And yet.
I contend that, as with earthquakes, time-to-AGI is surprisingly hard to estimate. Computers keep learning new tricks that we previously predicted wouldn’t be possible without AGI. Each time that happens, it feels like AGI is around the corner. But then you zoom out and notice that this has been happening since the 1960s. By 1970, SHRDLU, working in a virtual blocks world, could do this:
Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I DON’T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: BY “IT”, I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Computer: OK.
Person: What does the box contain?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: What is the pyramid supported by?
Computer: THE BOX.
Person: How many blocks are not in the box?
Computer: FOUR OF THEM.
A few decades later came grandmaster-level chess. Another 25 years after that, we got language models that could explain novel jokes and diffusion models turning captions into deepfakes. What does this tell us? Is it like this Scott Alexander joke:
They predicted Fidel Castro would die in ‘95, ‘00, ‘05, and ‘10, but they were wrong each time. Therefore Castro will never die.
Again, it’s fine for the historical rate of progress to give you the intuition that we’re another 25 or 50 years away from AGI. It’s just that the tectonic pressure keeps building. It may keep exceeding what we previously thought would be the breaking point, but that doesn’t mean there is no breaking point. We just genuinely don’t know when it will be.
Now consider the unprecedented amount of money being poured into training compute, not to mention brainpower on algorithms. If 25-50 years would’ve been about right, we’re probably 10x-ing the field of AI’s resources and plausibly (not to say necessarily) compressing that timeline to 2.5-5 years.
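To spell out that back-of-the-envelope arithmetic, here’s a toy sketch. It bakes in the naive assumption that progress scales linearly with resources, which is exactly the assumption the “plausibly (not to say necessarily)” hedge is covering for:

```python
# Toy timeline compression, assuming (naively) that progress scales
# linearly with resources: 10x the resources divides the remaining time by 10.
RESOURCE_MULTIPLIER = 10

for years_at_old_pace in (25, 50):
    compressed = years_at_old_pace / RESOURCE_MULTIPLIER
    print(f"{years_at_old_pace} years at the old pace -> "
          f"{compressed:g} years at {RESOURCE_MULTIPLIER}x resources")
# 25 years at the old pace -> 2.5 years at 10x resources
# 50 years at the old pace -> 5 years at 10x resources
```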
3. Intelligence Is Like the Electromagnetic Spectrum
The rationality community has been making this point forever but my friend Christopher Moravec came up with the evocative analogy here.
First, a reminder that people get hung up on whether intelligence is meaningful or, even if it is, whether it can be compressed into one dimension. We can route around this debate by talking about capability. But since we’re not realistically going to start saying “AC” or “supercapability”, just mentally swap in the word “capability” for “intelligence” in the context of AI. Fundamentally, we’re talking about ability to bend the universe to one’s will — nuclear power or driving a car on the moon or vaccines, to name a few highlights for humanity in particular.
Great, so now consider the spectrum of possible intelligence/capability, from cockroach to unfathomable alien superintelligence. Which of these three worlds is the one we live in?
World 1: The Joe-Six-Pack-to-Einstein range of that spectrum is like visible light in the electromagnetic spectrum: a very narrow band. As AI advances, there will be some n for which the jump from GPT-n to GPT-n+1 leapfrogs the whole band.
World 2: Due to being trained on human output, AI will get stuck in that human band and asymptotically approach the top end of it without breaking past. (Or at least new, currently unforeseeable breakthroughs will be needed to break past.)
World 3: AI plateaus before reaching the bottom end of the human band of the spectrum.
I don’t think those worlds are equally likely, but all three are still possible. I do think the human intelligence band really is fundamentally narrow, which makes world #1 much more plausible than it may intuitively seem.
4. AGI Is Like Y2K
Remember in 1999 how people talked about Y2K, how the whole financial system would implode because so many old computers stored only the last two digits of the year, so the year 2000 would be stored as “00”, which would be treated as the year 1900, and it would all be a disaster of biblical proportions?
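If you never saw the bug up close, here’s a minimal sketch of the two-digit-year arithmetic behind it. Illustrative toy code, not from any real system:

```python
from datetime import date

def two_digit_year(d: date) -> int:
    # Old systems saved storage by keeping only the last two digits of the year.
    return d.year % 100

def years_between(start: date, end: date) -> int:
    # With two-digit years, 2000 looks like "00", i.e., earlier than 1999's "99".
    return two_digit_year(end) - two_digit_year(start)

# An account opened in 1999 appears to be -99 years old on January 1, 2000.
print(years_between(date(1999, 1, 1), date(2000, 1, 1)))  # -99, not 1
```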
And then remember how everything was fine? But the doomsayers were absolutely correct! The reason it turned out fine was that everyone correctly freaked the frack out and we spent hundreds of billions of dollars fixing it just in time. This is known as the Preparedness Paradox. It’s like when your security team is so good that you never have an Incident and you start wondering, “why am I paying these people so much when we never even have any security incidents?”
I’m hopeful that that’s how it will go with AI. That as AGI approaches we’ll redirect a good portion of the world’s top minds and however many trillions of dollars it takes to figure out how to encode human values in a superintelligence. (Probably it’s going to be harder than adding two more digits to a date field.) It sure won’t happen by accident or because the superintelligence is so smart that it can just figure out what human values are. Artificial intelligence, and computer programs in general, don’t work like that. They just do exactly what you program them to do, which routinely turns out to be disastrously different from what you meant them to do. When your program is not a superintelligence, those disasters are self-limiting. At worst something blows up or melts down and you go back to the drawing board. When it is a superintelligence, a small error can spiral out of control and literally destroy the world.
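Here’s a toy, decidedly non-AI illustration of that “exactly what you wrote, not what you meant” failure mode (a made-up example, nothing to do with any particular AI system):

```python
# We *mean* "print the three highest scores", but the scores were read in as
# strings, so the program faithfully sorts them alphabetically instead.
scores = ["9", "100", "27"]
top_three = sorted(scores, reverse=True)[:3]
print(top_three)  # ['9', '27', '100'] (lexicographic order, not numeric)
```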
And that kind of “be careful what you wish for” problem is arguably the least worrisome version of the AI alignment problem. There are just so disturbingly many ways AGI can go disastrously wrong. And so dizzyingly many plausible timelines for how long it might take to get there.
In the News
Almost Entirely Human celebrates the Fourth of July with hilarious USA maps.
If you wanted an excuse to pay for Astral Codex Ten, two recent subscribers-only posts about AI have been excellent: one on how to make a personalized kids’ book and another giving a walk-through of how Scott used o3 and Claude as a research assistant for a recent post. He calls it “an unprecedented combination of brilliant and mendacious; too useful to avoid but too unstable to fully trust.” (Also the hidden open threads, for random discussion amongst the elite ACX commentariat, are high-quality. I sometimes float ideas for AGI Friday there.)
Stories about ChatGPT driving people literally crazy have been in the news. I think this is mostly silly, for reasons articulated by Andy Masley, but Zvi Mowshowitz argues the case that it’s not entirely silly.
The Overton window is shifting. Congresspeople even sound like they’re taking AGI risk seriously. I listened to much of that congressional hearing myself and there was plenty of “gotta beat China” and “how can we double the capacity of the power grid?” but also plenty of acknowledgment that racing to superintelligence could doom humanity, regardless of who gets there first.
Thanks to human Bethany Soule for making the “shooting past the visible band of the E-M spectrum” illustration. ChatGPT/DALL-E could not get its head around the idea that the rainbow needed to be oriented vertically but compressed into a thin horizontal band.
Here's an analogy that's been on my mind for a while: Maybe AI is like cancer. Cancer cells don't work right, but work well enough to live and spread. The body doesn't recognize them as enemies. And cancer grows vigorously! Similarly, AI doesn't work right. It has glitches and hallucinations when doing serious research and produces dangerous nonsense and falsehoods. And even when it does not, as when it has a glitch-free companionable chat with someone, or produces "creative writing," you get the feeling there's just something *wrong* with the DNA of its communication. And yet it's not giving so much whacked-out nonsense about research or saying such weird-as-hell stuff in "conversation" that we break off communication. We don't recoil from it and avoid it, as we might something more clearly alien and toxic. And as for AI's having cancer-like vigor: more and more internet content is AI-generated.
Curious to hear other people's mental models.
Here’s another analogy: an LLM is like a baby alien that was the sole occupant of a spaceship that crash-landed on Earth. We have been feeding and tutoring it ever since it landed. By the time it was 2 years old it was far ahead of human toddlers in many areas. It did not require instruction of the usual sort, just ever-larger samples of human communication, though it was possible for us to train it to do specialized tasks, and to be less likely to do things we thought were undesirable. Many of those studying it believe that it will soon reach the alien equivalent of puberty, and that when it does it will be capable of searching out whatever resources it needs for further development, and of training itself far more effectively than we have been able to train it. It can’t or won’t tell us much about how its mind works and what it is going to develop into.