The mess was mine

Mar 25, 2026

By Marco Santelli

You don't notice the moment it shifts. For the first few days, working with agents feels like the thing you've been waiting for your entire career — you describe what you want, the code appears, and your brain is finally free to think about the things it was always too busy typing to think about. Architecture. Long-term direction. The shape of the system three months from now. You feel like you have superpowers, and for a while that feels incredible.

The problem is that your brain, once freed, goes wherever it wants. That's what brains do — they follow the interesting thread, the one that lights up, the one that feels like progress. So while the agent is building your settings page, you're sketching the architecture for the next module, and while it's writing your data layer, you're answering an email from the infra team about something unrelated. You're doing exactly what a senior engineer should do — thinking at a higher level while the implementation takes care of itself.

Except the implementation takes care of what you asked for, which is a different thing entirely. The agent you pointed at a settings page also built a theme builder, because from its perspective that was a reasonable extension of the task — sixty files you didn't ask for, approved between calls because the summary looked fine and the tests passed. The agent you asked for a data layer produced something that works beautifully against the test suite and fakes half the validation, because the rule that all input goes through a schema first lives in your head, and your head was three tasks away when the code was written. The agent did exactly what agents do — it built something that works. The distinction between something that works and something that's right is yours to maintain, and you weren't there to maintain it.

This is the part that nobody warns you about when they talk about velocity. Agents write decent code, mostly. The problem is that every constraint you've built over twenty years of writing software lives in places an agent can't reach — in your taste, in the way you flinch when you see a helper function that duplicates something two directories over, in the instinct that says "this compiles and it's wrong" before you can even articulate why. When you're focused, all of that fires automatically. When you're multitasking across four parallel streams, all of it goes quiet. And the gap between those two states — between the engineer who reviews with judgment and the engineer who skims with confidence — is where the rot starts.

I sat down on a Saturday morning to do a refactor and the codebase felt like someone else's. Each file, taken in isolation, was fine. Together, they told no coherent story. Patterns shifted between directories. Abstractions sat next to raw implementations of the same thing. Naming conventions wandered.

It was a book where every chapter had been written by a different author, and the person who should have been holding the narrative together — me — had been too busy feeling productive to notice that productivity and coherence are different things.

The mess was made by an absent tech lead who happened to also be the only developer on the project. I had mistaken delegation for leadership, and output for direction, and the codebase was the receipt for that mistake.

If you've been working with agents for any length of time, you've probably felt the edge of this — that low-level unease when you approve something a little too fast, when the codebase starts feeling unfamiliar even though you built every part of it, when you catch yourself re-explaining a constraint you thought was obvious and realising it was only obvious to you. That's the gap between how fast you can build and how well you can lead what you're building, and better prompting and smarter models alone will never close it.

What closes it is structure — the kind that holds context between sessions so you're not re-explaining yourself every morning, that makes decisions visible so they leave a trail you can follow, and that keeps your agents working within boundaries you've actually defined rather than boundaries you assumed were obvious. But that's the next part of this story.

First, let me tell you what happened when one of those agents decided to improve my database schema while I was on a call, and three repos stopped working before I finished my coffee.

I'm building Swarmix to solve this problem.

A workspace where you lead a team of AI agents across your repos. Human-in-the-loop by design.