Agreed with your points. But there is one thing that always nags at me when I get pessimistic about the AI future. How did humans evolve to be socially empathic creatures, to value morality and to be moral, slowly but surely enacting a grand moral arc on a civilization-wide scale toward justice and equality?
Two possible explanations:
(1) God exists -- the supreme moral being who made humanity in his image and we're just following our program.
or
(2) The universe optimizes towards empathy and morality.
If (2), perhaps there are a lot more ways than we may imagine for AI to value sociality, empathy, and morality.
And if (1), God will surely intervene.
I'm happy to have these possibilities on the table but I don't think they're the only two. Brainstorming:
3. The values humanity has are contingencies of our particular evolutionary path. Every path to AGI yields very different values (or "values"). Most such paths may yield something we'd call psychopathic or horrific in ways we don't even have words for.
4. God exists but will let us lie in whatever bed we make. Maybe He even judges us by our efforts to prevent apocalyptic AI scenarios.
Yes, I should have said (3) is the default assumption behind the pessimism: humanity just happened to luck out on the "good" side of values. Roll the evolutionary dice again and anything could come up: a civilization that values evil, or something like that. But that's just it: is this position fully coherent? How do we get intelligent social organisms that thrive without moral impulses, without some hardwired empathy and sense of fairness? Maybe morality isn't about "good" or "bad"; maybe it's just the only possible strategy for successful intelligent social organisms.
In the same way, maybe AIs that share similar strategies and form coalitions with other AIs are far stronger and win against misaligned AIs; maybe this is far more likely than we think, given that Homo sapiens won the battle against all other value systems and we're still thriving.
Of course I can't do the math here, so that's as far as my speculation goes, but at least evolution's success makes me more optimistic for AI.
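The closest I can come is a toy, Axelrod-style sketch of the intuition. Everything here is invented for illustration: the strategies, the payoffs, the population sizes. It's a caricature of repeated-game dynamics, not a model of AI development.

```python
# Toy Axelrod-style tournament: a "coalition" of reciprocators vs. a few
# unconditional defectors in an iterated prisoner's dilemma.
# Invented for illustration only; not a model of AI development.
import itertools

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_hist, their_hist):
    # Cooperate first, then mirror the opponent's last move.
    return "C" if not their_hist else their_hist[-1]

def always_defect(my_hist, their_hist):
    return "D"

def play(s1, s2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

# Population: 8 reciprocators ("the coalition") and 2 defectors.
population = [tit_for_tat] * 8 + [always_defect] * 2
scores = [0] * len(population)
for i, j in itertools.combinations(range(len(population)), 2):
    pi, pj = play(population[i], population[j])
    scores[i] += pi
    scores[j] += pj

for name in ("tit_for_tat", "always_defect"):
    members = [i for i, s in enumerate(population) if s.__name__ == name]
    avg = sum(scores[i] for i in members) / len(members)
    print(name, avg)  # reciprocators average far higher than defectors
```

Run it and the reciprocators come out well ahead of the defectors, which is the (very hand-wavy) shape of my optimism, though of course it bakes in repeated interactions among roughly comparable agents.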
I've been chewing on this and bouncing things off of GPT-5. I'm really grateful for this debate! Here's where I'm at so far:
1. Human morality is, from Evolution's point of view, a kludge for handling repeated games with kinship, reputation, and symmetric vulnerabilities. Frontier AIs are shaped by gradient descent on proxy goals, run in architectures that are copyable, rewritable, and often centralized. Also core to the human evolution of morality is that coalitions can overpower defectors. If a single AI system (or a few tightly intertwined ones) is powerful enough, those multi-agent dynamics aren't in play.
2. Game-theoretic cooperation can emerge instrumentally (commitment, reputation, coalition formation) with no inherent concern for others. Besides, humans don't cooperate with ants, we just bowl them over. Suppose AIs are very empathic and cooperative, just not towards humans?
3. The moral arc for humans depends on institutions that tax harm and reward prosociality. Can we recreate that for AI?
4. Even if we can, would such prosocial AI be outcompeted by ruthlessly goal-directed variants?
I'm not saying you're wrong, but I think your optimistic case depends on a chain of things going right:
(a) That we end up "evolving" AI under competitive multi-agent dynamics like the ones humans evolved with.
(b) That doing so likely yields machine empathy.
(c) That the resulting machine empathy would do us humans any good.
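And those links have to hold jointly. With made-up numbers, just to show how the conjunction behaves: even if each of (a), (b), and (c) were independently 50% likely, the whole chain comes out to 0.5 × 0.5 × 0.5 ≈ 0.13, roughly one chance in eight that the optimistic story goes through end to end.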
Good points.
But continuing my speculation that the universe optimizes towards empathy and morality...
True, I don't expect an evolutionary process like the one that refined humans from apes to improve AIs much in the current landscape. Evolution relies on a slow and wasteful cycle of fitness selection, reproduction, and mutation, and compute constraints make that impractical. Companies are focused on improving what works now, not interested in straying too far from it. It seems far more likely to me that AI development continues to simply mimic human intelligence and values through language, text, and video, while improving architectures for memory capacity, speed, and efficiency, right up to and past human intelligence.
But this brings me to evolution's greatest gift: consciousness.
Michael Graziano is the author of a purely mechanistic model of consciousness (https://pmc.ncbi.nlm.nih.gov/articles/PMC4407481/, https://pmc.ncbi.nlm.nih.gov/articles/PMC3223025/), which at least seems plausible (in its simplest form) as a deep-neural-network implementation (https://arxiv.org/abs/2305.17375). He thinks consciousness is where empathy truly originates: consciousness is a model of the self that sees the world of other selves in a new light, with a new recognition.
Without consciousness, he thinks, AIs are sociopaths by default: https://archive.is/QCfHK
Quote "Consciousness is part of the tool kit that evolution gave us to make us an empathetic, prosocial species. Without it, we would necessarily be sociopaths, because we’d lack the tools for prosocial behavior. And without a concept of what consciousness is or an understanding that other beings have it, machines are sociopaths."
Giving AIs consciousness in the axiomatic sense that we have it -- the sense that another being experiences joy and pain in life as I do -- creates a profound connection. Confusing self with others is a good thing. In this way, AI could inherit human consciousness, empathy, and wisdom, truly valuing life as we do.
Humans don't cooperate with ants, but if ants actually had a rich inner world of conscious experience, we would treat them with far more care than we do now. Consciousness leads to recognition of self in others.
Now, true, psychopathy/sociopathy is still a thing in some conscious humans. But it's still relatively rare, which suggests it is easier to get empathy right than wrong. Further, in reading the literature, I find that the theme of a damaged self-model emerges most often; sociopaths/psychopaths effectively hate themselves, which frees them to mistreat others without compunction. We could avoid this by focusing on the importance of good AI self-models.
Finally, what about wars? The desire to punish is a moral instinct, and it seems to me that most atrocities are misguided punishment. Hopefully an intelligent AI would see the futility of that and opt for more pragmatic, more empathic solutions.
Yes, lots of speculation here but some hope I feel.
(Actively pursuing consciousness in AI as a safety measure: wow, fraught with ethical dilemmas, that's for sure.)
I now think I may have a longer list of objections. To pick a minor and not-entirely-serious one to start: what if ASI is super-conscious? Or, to turn that around, what if ants do count as conscious, technically -- just to a minuscule degree compared to humans? Would humans round that down to non-conscious and continue bowling them over? Seems like we would. In which case, maybe we still face the analogous danger with ASI.
Another minor one: I'm not sure the rarity of psychopathy in humans is so reassuring. The fact that psychopathy is possible at all could be an argument that empathy and prosociality aren't fundamental to general intelligence. I'm also a little skeptical about your idea for preventing psychopathy. Don't narcissism and psychopathy often go together?
Re: AI consciousness vs AI safety, see the new AGI Friday this week: https://agifriday.substack.com/p/welfare
Degree of consciousness: there is a clear threshold in human development where we reach identity and a narrative self, actually thinking and talking meaningfully about "I" and "me". I would think that threshold would be an important consideration for ASI.
At the same time, I do agree that an ASI could look at human suffering -- even peer into our brains and perceive it directly -- and yet conclude "They do not know true pain". Just as we deny children privileges, an ASI could do the same to us. Hopefully it would not curtail our privileges any more severely than that.
Narcissism: I do see narcissism as a damaged self-model (in its most malignant form, narcissistic personality disorder); it's a coping defense against pathological insecurities. Psychopathy/sociopathy/narcissism all shade into each other, I think. https://pmc.ncbi.nlm.nih.gov/articles/PMC10187400 puts it as "pervasive and consistent difficulty maintaining realistic self-esteem".
But the point is well taken: we don't understand human self-models, so we can't hope to reliably guide an ASI. Hopefully the study of AI self-models will be in some sense simpler and more illuminating.
> https://agifriday.substack.com/p/welfare
Agreed with that, it's too early. This 2023 article, which takes a very principled approach to consciousness-measuring, mostly flew under everyone's radar, but it has a lot of luminaries on it: https://arxiv.org/pdf/2308.08708. There is little reason to think today's models are conscious, but there are no technical barriers to building conscious AI systems. When researchers eventually target building consciousness (as I believe they must), all these sticky issues will be thrust to the forefront.