My P(doom) Rollercoaster
The bears and bulls are giving me whiplash. Also, progress on solving math.
P(doom) refers to the probability of a disastrous outcome from AI. See “Paperclipalypse” or the upper right corner of the Technological Richter Scale from my previous AGI Friday on AI risk. Again, nothing that current models can do is worrisome in that sense. It’s all about extrapolating. We took some steps (at least baby steps) towards AGI in the last 5 or so years. What if we see similar progress in the next 5 years?
In mid-2024 I said “my p(doom) has been inching downward lately, just because we’ve been stuck for a while, relatively speaking, at GPT-4 levels of intelligence”. Whenever I say things like that I rush to add that we don’t actually know anything. I like how Scott Aaronson put it at the time:
We genuinely don't know but given the monster empirical update of the past few years (i.e., “compressing all the text on the Internet is enough to give you an entity that basically passes the Turing Test”), which is probably the most important scientific surprise in the lifetimes of most of us now living ... given that, the ball is now firmly in the court of those who think there’s not an appreciable chance that this becomes very dangerous in the near future.
By the end of 2024 my p(doom) was inching back up again. Take this Manifold market I created in 2023 about AI’s ability to do basic geometric reasoning. It resolved NO on New Year’s Eve, but it was very close:
The answer now, with the latest reasoning models, is that AI can very much do that.
(To my surprise, this does not seem to include Claude 3.7 Sonnet, which otherwise seems to be the smartest of the models at things like coding. And side note, GPT o3-mini-high, as it is somehow named, not only tends to nail these, it’s sometimes cleverer than me at them.)
This still blows my mind. At one level, an LLM is merely repeatedly predicting the most likely next word of a partially completed sentence. It’s just that, in order to make the best word predictions, it needs to, as a side effect, build a world model in its head to actually understand what it’s talking about. We can quibble about what “actually understands” really means, but it’s doing something wild and powerful. And genuinely useful.
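For the curious, here’s a minimal sketch of that next-word-prediction loop, using the small open GPT-2 model via the Hugging Face transformers library. This is just my illustration of the general idea, not anything specific to the models discussed here: greedily append whichever token the model scores highest, over and over.

```python
# A toy version of "repeatedly predict the most likely next word":
# greedy next-token decoding with GPT-2 via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "A square has four equal sides and"
input_ids = tokenizer(text, return_tensors="pt").input_ids

for _ in range(10):  # extend the sentence by ten tokens
    with torch.no_grad():
        logits = model(input_ids).logits   # scores over the whole vocabulary
    next_id = logits[0, -1].argmax()       # pick the single most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

(Real chatbots sample from the distribution rather than always taking the top token, and the models are vastly bigger, but the loop is the same shape.)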
It’s occasionally, in some ways, better than me at geometric reasoning. It destroys me at advanced calculus. Or take the FrontierMath benchmark, announced in November. These are problems that professional mathematicians expect their human colleagues to be able to solve only after hours or days of work. No one expected AI to put much of a dent in those problems for years. Months later, AI can already solve over 10% of them. It’s not implausible to me (or to professional mathematicians) that within a few years, any math problem you can write down on a piece of paper that a team of Fields medalists could solve, AI will be able to solve as well. I would’ve thought that that would require AGI. Of course, people used to think that playing grandmaster-level chess would require AGI.
Point being, a week ago I was really feeling the bull case. Then GPT-4.5 came out. I’ve been playing with it this week and this is not like the upgrade we saw from GPT-3.5 to GPT-4, which may be a blow to Aschenbrenner’s bull case that we talked about last time.
But it’s too soon for me to tell. So tune in next time for what I think of GPT-4.5 and whether it’s sending the rollercoaster back into a dive.
In the news
If you want much more frequent and much more extensive updates on AI, I recommend Zvi Mowshowitz’s Don’t Worry About the Vase.
My friend Christopher Moravec has some fun deep dives on some impressive things he’s gotten AI to do on his new blog, Almost Entirely Human.
I’ve been talking about Gary Marcus’s bear case but I’m more impressed with Thane Ruthenis’s bear case posted on LessWrong.
And just to counterbalance that one, I’m also impressed with Ege Erdil’s bull case on the Epoch.AI substack.
I have been REALLY struggling to see the difference between a smart omni-style model that can continuously receive input (/ produce output) and a conscious human.
Perhaps I am allowing my neurons dedicated to the recognition of other humans, the faculty of communication, to override the rational sense that would allow me to assess these intelligences from a structural rather than an input/output perspective.
It's disconcerting because I use the words someone else uses to judge the contents of their mind. The way someone writes a sentence tells me a lot about how their brain works. I know from a mathematical perspective that LLMs are merely guessing the most likely upcoming token, one token at a time, but using this perspective to discredit intelligence doesn't seem right to me as I write this post on a processor that works off of Cape Cod potato chips. Hence, the structural argument to discredit LLMs doesn't hold much water for me, especially glancing at the almost non-existent real multitasking capabilities of humans.
When LLMs talk, sometimes I see sparks that make me think: you're a human. But then the conversation stops and the machine goes back to being a piece of melted sand.
I'll let my bias show, but at this point I feel very strongly that if I were to set up 4o to run continuously and use memory more smartly, deleting and reinforcing memories where appropriate (especially given OpenAI's conquering of the needle-in-a-haystack problem of pre-4o LLMs), I'd be content with calling that thing a conscious being. However limited it may be.
I wonder what Peter Watts has to say about this. In his book Blindsight, humanity runs into unconscious, ant-like superintelligences who rocketed to the top of the evolutionary food chain (far surpassing conscious beings) by not having a conscious brain, instead having hyper-intelligent subconsciousnesses. (If I'm not mistaken.)
When they enter the solar system, we quickly realize that they do not have even a GPT-1 level of sentence-parsing ability, yet for some reason these beings have managed to achieve interstellar travel and have expended considerable resources to build a Chinese Room to deal with the human species' radio emissions. To unconscious beings, having to parse the contents of language is almost akin to dealing with a DDoS attack. And so one wonders whether these beings have come here to wipe out the forms of life that are unable to communicate at a subconscious level, because their communications look like a virus designed specifically to waste resources on the parsing of symbols.
I'm not a philosopher and I never took enough philosophy courses (I wish I had), but stepping back to the structural mold: LLMs work entirely off of tokens. And yet, perhaps what makes us conscious is our usage of symbols for communication, whether that be language, body language, attacks, defense. If we have decided to make machines that trade specifically in the manipulation of symbols, frankly, what do I care what's manipulating the symbols if their pattern of symbol manipulation so closely matches what we think of as thought?