AI Agents To Get Useful in 2025
And other upcoming predictions from AI 2027, medical diagnosis already superhuman
I’m working my way through the book-length AI 2027. I may try a book brigade of it (let me know if you’re interested in joining). For those just tuning in, the idea behind AI 2027 is to describe a modal scenario: pick the most likely single thread of specific events in order to create a coherent narrative. Reality won't match that thread, of course, but it's the best we can do if we're forced to pick a specific thread. This is helpful for getting our heads around how this could play out. Scott Alexander and the other authors emphasize that it's plausible that everything happens up to 5x slower or (implausibly, in my opinion) faster than this.
Here’s my collection of notable upcoming predictions from AI 2027:
In 2025, AI agents that can click and type on the web start providing actual value
In 2025, data centers are built that can train a model with 10^28 FLOP of compute, 1000x GPT-4
In 2025, AI is especially good at helping with AI research
By the end of 2025, AI company revenues triple and the top AI company hits a market cap of a trillion dollars
In 2026, frontier labs cement their lead by using their best internal models to accelerate research, getting 50% faster on algorithmic progress (as opposed to progress from mere scaling)
In 2026, coding automation is to the point of nailing any programming problem with a functional spec
In 2026, AI agents are like scatterbrained employees who thrive under careful management
In 2026, China routes around chip restrictions and scrambles to catch up
In 2026, the stock market surges, driven by Nvidia, AI companies, and companies integrating AI assistants
In 2026, AI has decimated the job of junior developer, but those who can manage and quality-control teams of AIs are making a killing
In 2027, we have continuous training as opposed to pre-training of models, so the AI gets smarter by the day
In January of 2027, AI approaches superhuman engineering ability (designing and implementing experiments) and hits the 25th percentile of frontier lab human researchers on “research taste”
In February of 2027, things get crazy with China (geopolitics! espionage!) as the top non-public frontier model approaches superhuman levels of hacking and cyberwarfare
In 2027, all coding is automated and AI research dramatically accelerates
In July of 2027, we hit AGI, have cheap AI remote workers, and all hell breaks loose
The scenario continues, and actually has two possible endings — one good, one bad — but let’s pause here before it gets too sci-fi-seeming, and start with a prediction market for that first one:
The term “agents” in this context refers to tools like OpenAI’s Operator that can use a virtual keyboard and mouse to go do arbitrary tasks on the internet for you. These are more trouble than they’re worth for now. Manifold estimates a 30% chance I’ll find practical use for such agents by the end of this year.
So if this first AI 2027 prediction turns out correct, that will be evidence for taking the rest of these predictions more seriously.
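To make that reasoning concrete, here is a toy Bayesian update in Python. Every number in it is made up for illustration: the 25% prior that AI 2027 is broadly on track and the two likelihoods are my assumptions, chosen only so that the implied overall chance of useful agents lands near Manifold's 30%. The function name `posterior` is likewise just a label for this sketch.

```python
def posterior(prior, p_if_true, p_if_false):
    """Bayes' rule: P(hypothesis | evidence) for a binary hypothesis."""
    hit = prior * p_if_true          # hypothesis true and evidence observed
    miss = (1 - prior) * p_if_false  # hypothesis false and evidence observed
    return hit / (hit + miss)

# Hypothetical inputs (assumptions, not figures from AI 2027 or Manifold):
prior_on_track = 0.25        # belief that the scenario is broadly on track
p_agents_if_on_track = 0.6   # chance agents get useful in 2025 if it is
p_agents_if_off_track = 0.2  # chance agents get useful in 2025 if it is not

# Overall chance of useful agents implied by these inputs (about 0.30)
p_agents = (prior_on_track * p_agents_if_on_track
            + (1 - prior_on_track) * p_agents_if_off_track)

updated = posterior(prior_on_track, p_agents_if_on_track, p_agents_if_off_track)
print(f"P(agents useful) = {p_agents:.2f}")
print(f"P(on track | agents useful) = {updated:.2f}")
```

With these made-up inputs, a correct first prediction would move the belief from 25% to about 50%. The size of the jump depends entirely on how much likelier useful agents are in an on-track world than in an off-track one.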
In the news
As reported in two Nature papers — one introducing a conversational diagnostic AI tool called AMIE and another on a study pitting it against doctors — AI is superhuman at medical diagnosis now. That may not even be news. Maybe the news (as claimed in the second paper) is that it’s so superhuman that it’s like chess AI: if you don’t trust it and try to second-guess it, you just make it worse.
I’m also thinking that a book brigade of Leopold Aschenbrenner’s Situational Awareness is in order. If we do this in June — a year after it was published — we can discuss how the predictions are faring so far and decide how seriously to take the ones that are still about the future.
I'm down for a book brigade for Situational Awareness in June!
I wonder which will come first: AI agent models get pretty good at computer use, or a decent number of sites realize they can get a huge advantage by exposing an API that AI agents can use more easily. If Priceline allows Operator to book via API and Booking(dot)com doesn't, I know which one my vacation planner agent will use.
These are all intriguing enough that I'll keep an eye on the Manifold markets for these predictions, especially the order-of-magnitude questions.