Brilliant framing on the 'clear success criteria' angle there. The distinction between AI solving problems versus coaxing it into the lightbulb moment actually gets at something I've been noticing in production environments: the feedback loop is everything. Once verification becomes cheaper than generation, we're basically in a different game where humans curate rather than create. Honestly makes me wonder if the Aaronson debacle isn't just a preview of what peer review looks like when slop is indistinguishable from substance at first glance.
Thanks! Well said about the feedback loop.
As for science slop, there are two opposing forces. AI makes it easy to flood the review process with slop, but also makes it easier to do wheat/chaff separation. I don't have a strong prediction of which side will win. Most interestingly, I don't even know which side wins post-AGI.
(Maybe this is yet another AGI alignment disaster scenario... *wavy imagination lines* We've reached AGI and beyond but what the robots are superhumanly good at is gaming the peer review system, which they're on both sides of. Scientific publishing eats the economy without any of the research being about anything a reasonable human could care about. Like continental philosophy. Just kidding. But you can imagine getting locked into an equilibrium where there's no hope of keeping up as a scientist without letting AI write your papers and there's no hope of doing peer review without outsourcing it to AI and the humans end up sidelined altogether and the "scientific" "literature" spirals out of control and into deeply meaningless onanistic arcana.)
PS: By "Aaronson debacle" you mean the theoretical physics paper that Scott Aaronson said looked reasonable at first glance but was convinced by someone who read it carefully and concluded it was worthless? To be clear, Scott Aaronson was just a bystander there. I centered him in my telling because I trust his judgment. Maybe I should edit that so I don't give the impression that he was involved in the paper or was duped by it (I think he meant "first glance" pretty literally!).
Adding to the potential confusion, Scott Aaronson himself recently published a paper in which GPT-5 made a nontrivial intellectual contribution. He blogs about it here: https://scottaaronson.blog/?p=9183