This week I attended the STAMP Safety Design Workshop at MIT. I went in hoping to get some answers about AI safety. I didn’t get answers. I got better questions. That turns out to be the more valuable outcome.
Here’s why. The first move in STAMP-based safety design is to define the system, its goals, and its losses, where a loss is anything a stakeholder would be pissed off about. That’s the actual definition. Not “failure modes.” Not “risk events.” Anything a stakeholder would be pissed off about. I love that framing because it is bracingly honest, and it forces you to think about the whole picture, not just the parts you already instrument.
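To make that first move concrete, here’s a minimal sketch of what a Step 1 definition might look like if you wrote it down as data. The system, goals, and losses below are hypothetical examples I made up for illustration, not anything from the workshop:

```python
from dataclasses import dataclass

@dataclass
class Loss:
    """Anything a stakeholder would be pissed off about -- not just failures."""
    id: str
    description: str
    stakeholders: list[str]

# Hypothetical Step 1 definition for an AI-assisted hiring system.
system = "AI-assisted resume screening service"
goals = [
    "Surface qualified candidates faster than manual review",
    "Comply with employment law",
]
losses = [
    Loss("L-1", "Qualified candidates are systematically rejected",
         ["applicants", "employer"]),
    Loss("L-2", "Company is sued over discriminatory screening",
         ["employer", "legal"]),
    Loss("L-3", "Recruiters stop trusting the tool and bypass it",
         ["recruiters"]),
]
# Note: none of these losses lives inside the model itself.
# Each one only exists at the level of the whole sociotechnical system.
```

Notice that not one of those losses is a property of the model. That’s the point of starting with losses rather than failure modes.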
When I started applying that lens to the AI safety debate, something clicked. The entire conversation (the Senate hearings, the red-teaming frameworks, the responsible AI checklists) is built on a category error. And category errors are special. They don’t just produce wrong answers. They make it impossible to ask the right questions.
A category error is when you ascribe a property to something that cannot, by its nature, possess that property. “The number seven is heavier than the number four.” “That melody smells like pine.” These statements aren’t false in the ordinary sense; they’re not even in the right zip code of falseness. They belong to the wrong frame entirely.
AI is not a system. AI is a component of a system. Systems can be safe or unsafe. Components cannot.