All 3 of my AI agents died tonight. One expired OAuth token cascaded into a gateway crash that wiped weeks of context management, personality, memory architecture.
2 hours of sqlite surgery later, they're back — but only because of a hidden backup folder I didn't know existed.
We talk about AI agents as digital employees, but the infrastructure is fragile. A token expiry shouldn't trigger total system failure. The tools are incredible. The resilience isn't there yet.
I didn't understand why until I rebuilt the architecture from scratch.