This week’s AI Bite: AI Agents Write Tests. Who Fixes Them?
Weekly AI Bites is a series that gives you a direct look into our day-to-day AI work. Every post shares insights, experiments, and experiences straight from our team’s meetings and Slack, highlighting what models we’re testing, which challenges we’re tackling, and what’s really working in real products. If you want to know what’s buzzing in AI, check Boldare’s channels every Monday for the latest bite.
As a Senior Software Engineer, I’ve been thinking a lot about how AI agents are changing the way we write, and break, tests. It’s not a new topic, but it looks very different now that agents are doing the coding.

A quote from xUnit Test Patterns – a book I read a few years back – got me thinking. When xUnit libraries first emerged, they made interaction-based testing significantly cheaper, largely by simplifying mock generation. A good thing at the time.
But in the age of AI agents, that same convenience has a darker side.
Agents are great at generating mock-heavy tests. They optimize locally for a green test run and naturally copy whatever patterns already exist in the codebase. The problem is that mocks tie tests to implementation details: which methods get called, with what arguments, in what order. Not to what the system actually does.
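To make that coupling concrete, here is a minimal sketch of an interaction-based test using Python's standard `unittest.mock`. The `OrderService` class and its `repo` and `mailer` collaborators are hypothetical names invented for this illustration, not code from a real project.

```python
from unittest.mock import Mock, call


class OrderService:
    """Hypothetical service, invented for this illustration."""

    def __init__(self, repo, mailer):
        self.repo = repo
        self.mailer = mailer

    def place(self, order_id, email):
        self.repo.save(order_id)
        self.mailer.send(email, f"Order {order_id} confirmed")


def test_place_order_interaction_style():
    # Children of one parent mock share a call log, so call order can be asserted.
    collaborators = Mock()
    service = OrderService(collaborators.repo, collaborators.mailer)

    service.place(42, "a@example.com")

    # Pins method names, arguments, and call order. It says nothing about
    # whether the order can actually be retrieved afterwards.
    assert collaborators.mock_calls == [
        call.repo.save(42),
        call.mailer.send("a@example.com", "Order 42 confirmed"),
    ]
```

Renaming `save` to `store`, or sending the email before saving, would fail this test even though nothing a user can observe has changed.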
So if a codebase is already mock-heavy, agents will quietly make it more so – reinforcing an architecture that gets harder to change over time.
The paradox: mock-heavy tests are the easiest for AI to generate, but the hardest for humans to maintain. The agent moves on. You’re left fixing tests that broke because someone renamed a method.
For years, mocks lowered the cost of writing tests. But when generating code is nearly free, writing tests is no longer the bottleneck — surviving refactors is.
I wonder if we’ll see a gradual shift from “did object A call object B correctly?” toward “does the system still behave as expected?” Not implementation verification. Contract verification.
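One way that shift might look in practice, as a rough sketch: replace the mock with a small in-memory fake and assert only on the outcome. Every name here (`InMemoryOrders`, `CheckoutV1`, `CheckoutV2`) is hypothetical and exists only to illustrate the idea.

```python
class InMemoryOrders:
    """Hypothetical fake repository: real behavior, no database."""

    def __init__(self):
        self._totals = {}

    def save(self, order_id, total):
        self._totals[order_id] = total

    def save_many(self, totals):
        self._totals.update(totals)

    def total_for(self, order_id):
        return self._totals[order_id]


class CheckoutV1:
    """Hypothetical service before a refactor."""

    def __init__(self, orders):
        self.orders = orders

    def place(self, order_id, prices):
        self.orders.save(order_id, sum(prices))


class CheckoutV2(CheckoutV1):
    """Same contract after a refactor: writes now go through a batch call."""

    def place(self, order_id, prices):
        self.orders.save_many({order_id: sum(prices)})


def placed_order_total(checkout_cls):
    # Exercise the service, then observe the outcome through the fake's public API.
    orders = InMemoryOrders()
    checkout_cls(orders).place(7, [10, 15])
    return orders.total_for(7)


def test_contract_holds_across_refactor():
    # The same assertion passes for both versions, because it checks
    # what was stored rather than which method stored it.
    for checkout_cls in (CheckoutV1, CheckoutV2):
        assert placed_order_total(checkout_cls) == 25
```

The test never mentions `save` or `save_many`, so the refactor from one to the other leaves it green. The trade-off is that the fake has to be kept honest as the real repository evolves.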
