When async agents outnumber humans, code review stops being the verification layer
Priya Nair
A few weeks ago I caught myself opening a PR that another tool, not a person, had drafted, and I realised I had not actually read the diff in full that day. I am not proud of that. But the muscle I built for reviewing code line by line is the same muscle I have been quietly relaxing to keep up with the pace the rest of my team is shipping at. The New Stack picked up a piece from Ido Pesok at Cognition this week that names the thing I had been refusing to name: more Devin runs are now triggered asynchronously (by events, schedules, automations and other agents) than by humans.
Sit with that for a second. The reviewer was never going to scale.
The moment the ratio flipped
For most of my career, "verification" was a thing a person did. Somebody decided to ship; somebody else read the diff; eventually two humans agreed and the bot merged. It was a bottleneck, but an honest bottleneck — every change had an entry point you could ask questions of.
What Pesok describes, and what The New Stack draws out, is a quieter inversion. When async triggers — events, scheduled jobs, automation, other agents — outnumber the "I want to ship this" human moments, most of what runs in your pipeline was not initiated by anyone you can pull into Slack. You can multiply reviewers, but you cannot multiply reviewers fast enough to catch up to a population of agents that triggers itself.
Why piling more onto CI does not save you
The first instinct, the one I had reading this, is to push back into CI. More checks. Stricter gates. Mandatory approval from a senior engineer. I have written that exact RFC. It is comforting because it lives in a place we already know — the pipeline.
The argument in the article is that this is the wrong layer. For cloud-native software, a change is not "true" the moment CI goes green. It is true when it is in front of traffic, talking to its neighbours, holding load, behaving — or not — in conditions a unit test never simulated. If an agent triggered the run, nobody verified intent on the way in. So verification has to be there on the way out: at runtime, in the same place where blast radius is real.
That framing has been bouncing around SRE writing for years. The trigger-ratio shift is what makes it urgent right now. Code review was a property of the change. Runtime verification is a property of the system.
What this looks like on a real engineer's week
I do not have the whole pattern worked out yet. Nobody does. But I can name the shape of the shift, and the things I would want on my own pipeline tomorrow:
- Progressive rollout as the default, not as a feature-flag toggle somebody enabled once during a hackathon. Canaries with real success criteria, not just "no 5xx for ten minutes".
- Observability wired to the deploy event, so a rollout can tell the difference between a noisy SLO it should ignore and a regression it should reverse. The runtime can say "this change is wrong" — you have to let it.
- Rollback that is fast enough to actually use. If reverting takes a meeting, runtime verification was theatre.
- A clear story for which automated changes get the same path as a human one, and which ones get a slower lane. Not every async trigger has earned the same trust.
None of this is novel taken one piece at a time. What is novel is needing it on the boring services that used to coast on "the team reads every PR".
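To make "canaries with real success criteria" a little more concrete, here is a minimal sketch of a canary gate that compares the canary against the live baseline over the same window, instead of checking a fixed "no 5xx" rule. Everything in it is an assumption for illustration: the `Window` and `judge_canary` names, the three verdicts, and the threshold values are hypothetical and not taken from the article or from any particular rollout tool.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PROMOTE = "promote"
    HOLD = "hold"          # keep baking, not enough evidence yet
    ROLLBACK = "rollback"


@dataclass(frozen=True)
class Window:
    """Metrics for one deployment arm, measured over the same bake window."""
    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def judge_canary(
    baseline: Window,
    canary: Window,
    min_requests: int = 500,              # hypothetical threshold
    max_error_rate_delta: float = 0.005,  # hypothetical threshold
    max_latency_ratio: float = 1.2,       # hypothetical threshold
) -> Verdict:
    # Too little traffic on either arm is not evidence of health.
    if canary.requests < min_requests or baseline.requests < min_requests:
        return Verdict.HOLD
    # Compare against the baseline, not an absolute SLO: if both arms are
    # equally noisy, the change is probably not the cause.
    if canary.error_rate - baseline.error_rate > max_error_rate_delta:
        return Verdict.ROLLBACK
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return Verdict.ROLLBACK
    return Verdict.PROMOTE
```

The relative comparison is the design choice that matters here: a service-wide error spike that hits both arms does not trigger a rollback, while a regression that only the canary shows does. A `ROLLBACK` verdict is only useful if the revert behind it is automatic and fast.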
The rough edge I keep getting stuck on
Runtime verification is expensive. In instrumentation. In alerting hygiene. In the cultural patience to wait while a canary bakes when the dashboard looks fine. It is much, much easier to bolt one more required check onto a PR template and call the problem solved. I expect plenty of teams will keep doing the easy thing, mine included on the wrong week.
What I am watching next
Two things. First, whether pipeline tools start treating the trigger as first-class metadata — human, schedule, agent, upstream service — so verification can be routed differently per origin. A change that started life inside an automation does not deserve the same fast lane as a change a person owned end to end. Second, what the on-call rotation looks like in a team where most changes were never typed by a person. The pager is the honest layer in all of this. We will hear from it first.
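A rough sketch of what "trigger as first-class metadata" could look like, assuming the pipeline records an origin for every change. The `Origin` values mirror the four listed above; the `Lane` fields, the two lanes, and their numbers are hypothetical placeholders, not settings from any real tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    HUMAN = "human"
    SCHEDULE = "schedule"
    AGENT = "agent"
    UPSTREAM_SERVICE = "upstream_service"


@dataclass(frozen=True)
class Lane:
    """Verification settings applied to a rollout (illustrative fields)."""
    name: str
    canary_percent: int
    bake_minutes: int
    auto_promote: bool


# Hypothetical values; real numbers would depend on traffic and risk.
FAST = Lane("fast", canary_percent=10, bake_minutes=15, auto_promote=True)
SLOW = Lane("slow", canary_percent=1, bake_minutes=60, auto_promote=False)


def lane_for(origin: Origin, human_owner: Optional[str] = None) -> Lane:
    # Only a change that a named person owned end to end gets the fast lane.
    if origin is Origin.HUMAN and human_owner:
        return FAST
    # Everything else, including a "human" change with no recorded owner,
    # defaults to the slower lane until that origin has earned more trust.
    return SLOW
```

Defaulting to the slow lane means a missing or unrecognised origin fails safe. Under these assumptions, `lane_for(Origin.AGENT)` returns the slow lane, and so does `lane_for(Origin.HUMAN)` when no owner is recorded.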
Source: The New Stack (thenewstack.io)