Killer Robots Are Perrrrfectly Safe
Like purring kittens. Maybe purring saber-toothed tigers. But AGI risk ≠ killbots.
I don’t want to make light of this. I’m hugely in favor of international treaties limiting autonomous weapons. If the idea of a machine deciding to take a human life gives you the heebie-jeebies, that is A+ moral intuition.
I picked on a Foreign Affairs article two weeks ago that called AGI a delusion. Today I want to talk indirectly about another Foreign Affairs article from last year that one of you pointed me to. This one has the advantage that the authors at least know things about the subject matter they’re talking about. They argue that autonomous weapons are the future of war and America needs to get with the program. I’m not thrilled with how they’re actively normalizing killer robots and undermining the ick factor. But since I know nothing about the military, I shall refrain from running my mouth off about that, other than to point to laudable ongoing attempts to regulate autonomous weapons.
My actual point here is that literal killer robots as they exist today are almost completely orthogonal to the risks of AGI. AI with lethal weapons sure sounds like the kind of thing we should freak out about. Does that not blatantly facilitate the AI’s ability to subjugate humanity? My answer is no, that’s not the right way to think about AGI risk. The risk, as I see it, is an AGI that bootstraps itself to a superintelligence that gradually crowds out and disempowers humanity without ever using overt violence. If an AGI ever did something so ham-fisted as to use killer robots against humans, humanity would be galvanized and coordinate to shut it all down. An actual AGI would likely have a self-preservation drive, consciously or not, and would foresee that.
(This is known as instrumental convergence. Building an AGI means giving it goals, effectively making it want things, and no matter what those goals are, it can’t achieve them if it gets turned off. So “don’t get turned off” is an instrumental sub-goal that it naturally converges on.)
Maybe giving killbots to an AI is even a nice honeypot? Like giving a potential enemy a fake gun to see if they try to shoot you with it. If AI has a murderous streak — not counting the usual version in warfare that, somehow, to humanity’s shame, is still fully normalized — then we really want to find that out as soon as possible. Okay, this isn’t actually likely to be helpful.
My serious point is that if the AI’s ability to annihilate humanity is so precarious as to depend on the availability of killer robots, that’s not the superintelligence we’re worried about.
Again, there are plenty of more mundane reasons to oppose killer robots, both philosophical and practical, like foreign enemies hacking them. But if that Foreign Affairs article convinces you killbots are worth it, so be it. I just want to be clear that it’s all a purely pre-AGI debate. Post-AGI it’s irrelevant. We either solve the alignment problem before reaching AGI or we’re equally screwed with or without killer robots.
Maybe the real killer robots are…
I know Leopold Aschenbrenner’s Situational Awareness, which I reviewed three weeks ago, resonated with many of you. (In that review I breezed over the extensive military discussion in the book; it’s about AGI itself as a superweapon, not killer robots per se.) In particular, Aschenbrenner talks about three kinds of people when it comes to AGI forecasting: the naive doomers, the AGI realists like himself, and the blind/evil accelerationists. I first want to reemphasize that the realists like Aschenbrenner are surprisingly doomy, though perhaps wisely rejecting that term. The risks are real, including the risk of losing control of AGI altogether, and the debate is about how to mitigate those risks. So I was disturbed to learn from Scott Alexander this week that the blind/evil accelerationists, led by Marc Andreessen, have put together an unprecedentedly large ($200 million and counting) super PAC aimed at fighting any and all AI regulation.
For everything but AGI I’m an accelerationist myself, so I sympathize. To a point. But as far as I can tell, none of these people are engaging with the arguments for caution at all, other than to mock them. I believe my AGI Delusion Delusion post captures the level of discourse pretty faithfully. That plus the straight-faced argument that if AI did displace humanity then it must’ve deserved to. If that’s not enough to send you into a murderous rage then, well, let me know and I’ll dedicate a future AGI Friday to how inhumanly wrong it is.
Random Roundup
OpenAI released a ChatGPT-infused browser called Atlas. I don’t recommend it at this point. Maybe hold out for Claude’s browser extension instead.
Scott Alexander’s AGI forecasting based on the Book of Revelation is hilarious. I saw the live version, which was especially entertaining, but the blog post version captures it well.
AI superstar Andrej Karpathy argues that AGI is coming but not in the next few years. This matches my own intuitions.
I’ve signed the Statement on Superintelligence, which is kind of a watered-down version of “if anyone builds it, everyone dies” (see my meta-review of the book by that name from last month).
Thanks to almost entirely human Christopher Moravec for the observation that killer robots and AGI risk are orthogonal problems, and 100% human Bethany Soule for copyediting. GPT-5 had a couple decent ideas.