2 Comments
Ray Sarraga

Hi Danny,

Here is another feature of AI that, I believe, deserves comment from well-informed practitioners: the fact that, to date, humans cannot trace how AI models reach the specific results they present (text, pictures, etc.). Here is a link:

https://www.darioamodei.com/post/the-urgency-of-interpretability

Best regards,

Ray

Daniel Reeves

Agreed, this is pretty huge. I've been thinking about that post as well, linking to it as the final "in the news" item in the previous AGI Friday: https://agifriday.substack.com/p/tesla

This is also what I've meant when saying it's scary how far alignment research seems to be lagging behind capabilities research.
