Discussion about this post

Charlie Sanders

Thoughts on the stated chain of thought:

5): There could be unique properties of humans (e.g. possessing government-issued ID and being able to interface with systems that require it) that give humans as remote workers a unique selling point over AIs. Abstracting "remote work" is dangerous.

6): What if AI research involves figuring out how to optimize hardware? It seems at least conceivable that there will be a need to build out and integrate novel types of infrastructure, of a kind that can't be built without physical embodiment.

7): What if it's recursive, but with a logarithmic function shape and a coefficient of 0.000...1? Recursive is a category, not a trajectory.

9): Why does benchmark creation have to be autonomous? Can you point to a single instance of an autonomously created benchmark, ever, in the history of humanity? Have any labs announced plans to no longer create their own benchmarks and to instead automate their creation? Where is this assumption coming from and why?

11): This doesn't follow from the prior postulates. You've snuck in a bunch of highly contentious assumptions (the orthogonality thesis, inherent limits to corrigibility, no meaningful regulation or societal response, etc.) in this step.

12): You've snuck in the assumption that the first AI that publicly schemes is smart enough to get away with it. Just because an AI can eventually get away with scheming doesn't mean that it will succeed the very first time it tries. Consider: do you extend the assumption of perfect-first-time success to humans trying to detect scheming as well?

13): There are plenty of technologies humanity has collectively shut down, such as CFCs and gene editing. The AI industry is far more consolidated and vulnerable to nationalization than those industries, so it doesn't follow that shutting down development would be impossible.

On a broader level, governments are not NPCs in the trajectory of Superintelligence. The recent Anthropic-DoW dust-up should make that extremely clear. It'd be worth considering how you've updated your assumptions based on it.
