Decoder · December 4, 2025 · 60m

The Tiny Team Trying to Keep AI from Destroying Everything

Hayden Field profiles Anthropic's societal impacts team — a small group tasked with ensuring one of the world's most powerful AI companies does not cause irreversible harm.

Canon

•

The Stoic challenge of AI safety — controlling process while accepting uncertainty about outcomes

Anthropic's safety team can control their evaluation rigor and internal processes but cannot guarantee their models will not cause harm. The Stoic approach focuses on process excellence.

Claude ChatGPT Gemini

•

Meaning from existential responsibility — the weight of keeping AI safe

Anthropic's safety team derives deep meaning from the gravity of their responsibility, but this meaning comes with a psychological cost that must be managed.

Claude ChatGPT Gemini