The three-hour bug your API tests should have caught

There is a specific kind of quiet on the team channel that I have learned to distrust. You have shipped a feature, the build was green, the deploy went through, and for a couple of hours everything looks calm. Then support opens a thread, and the cause turns out to be a downstream API returning something the client cannot parse. The feature worked. The API contract did not.

DevOps.com published a piece this week, "How to Automate API Testing in CI/CD Pipelines", that puts words to the shape of the fix. The argument is simple: your build passing is not the same as your API behaving. If the only thing that ran between "approved" and "in production" was a unit suite that mocked the network, the three-hour gap is baked in.

The shape of a loop that finishes in minutes

The piece frames it as three test stages tied to three CI triggers, and this is the part I want other teams to internalise:

  1. On every pull request, run a lightweight smoke suite. Health checks, a few happy-path calls, a couple of critical negative cases. Budget it at 2 to 3 minutes so nobody starts a Slack conversation while it runs.
  2. On merge to main, run the full API suite. Longer flows, edge cases, error paths. Budget it at 5 to 15 minutes, and accept that this is the run you can walk away from.
  3. On staging deployment, run integration tests against the real dependencies. This is where the "feature worked, API returned garbage" class of bug finally gets caught, because the mocks are gone.
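The three stages above can be sketched as a small lookup from CI trigger to suite and time budget. This is a minimal illustration, not anything the piece prescribes: the event names, the Stage type, and both helper functions are hypothetical, and the budgets are simply the upper ends of the ranges quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    suite: str
    budget_minutes: Optional[int]  # None where no budget is stated


# Hypothetical event names; every CI system exposes its own trigger variables.
STAGES = {
    "pull_request": Stage(suite="smoke", budget_minutes=3),
    "merge_to_main": Stage(suite="full", budget_minutes=15),
    "staging_deploy": Stage(suite="integration", budget_minutes=None),
}


def select_stage(event: str) -> Stage:
    """Return the test stage for a CI event, failing loudly on unknown events."""
    try:
        return STAGES[event]
    except KeyError:
        raise ValueError(f"no test stage mapped to CI event {event!r}") from None


def over_budget(stage: Stage, elapsed_minutes: float) -> bool:
    """True when a run exceeded its stage budget; unbudgeted stages never trip."""
    return stage.budget_minutes is not None and elapsed_minutes > stage.budget_minutes
```

Failing on an unmapped event is deliberate: a trigger that silently runs no tests is how a stage quietly disappears from the pipeline.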

The piece adds one habit I want to underline: define your response-time thresholds before you write the test, with explicit pass and fail criteria. A test that reports "it responded" but not "it responded in under 400 ms" is a test that misses the regression you actually care about.
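To make the threshold habit concrete, here is a rough sketch of a check that fails on latency as well as status. The Threshold type and check_endpoint function are hypothetical names, and the call is passed in as a plain function so the sketch runs without a network.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Threshold:
    """Pass/fail criteria, written down before the test exists."""
    expected_status: int
    max_latency_ms: float


def check_endpoint(call: Callable[[], int], threshold: Threshold) -> List[str]:
    """Run one call and return its failures; an empty list means pass."""
    start = time.perf_counter()
    status = call()  # the callable returns an HTTP status code
    elapsed_ms = (time.perf_counter() - start) * 1000

    failures = []
    if status != threshold.expected_status:
        failures.append(f"status {status}, expected {threshold.expected_status}")
    if elapsed_ms > threshold.max_latency_ms:
        failures.append(
            f"took {elapsed_ms:.0f} ms, limit {threshold.max_latency_ms:.0f} ms"
        )
    return failures
```

In a real suite the callable would wrap an actual request, for example a function that returns requests.get(url, timeout=5).status_code, and the test would assert that the returned list is empty.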

Contract tests are cheap once you have them

For microservices, a suite that only exercises HTTP shapes will miss the bug where a producer changes a field and forgets to tell anyone. The piece points at Pact: the consumer writes the contract, the producer runs it in CI, a breaking change fails on the producer's PR instead of on the consumer's Tuesday afternoon. The boring output is a test that says "yes, still compatible" for months, and then one morning it saves your migration.
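The idea behind a consumer-driven contract can be shown without any tooling. The sketch below is a hand-rolled stand-in, not Pact's API: the contract format, the field names, and the contract_violations function are all hypothetical. The consumer lists the fields it reads and the types it expects, and the producer's CI checks a real response against that list.

```python
from typing import Any, Dict, List

# Hypothetical contract: only the fields the consumer actually reads.
ORDER_CONTRACT: Dict[str, type] = {
    "id": str,
    "total_cents": int,
    "currency": str,
}


def contract_violations(
    contract: Dict[str, type], response: Dict[str, Any]
) -> List[str]:
    """Return every way the response breaks the consumer's expectations."""
    violations = []
    for field, expected_type in contract.items():
        if field not in response:
            violations.append(f"missing field {field!r}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            violations.append(
                f"field {field!r} is {actual}, expected {expected_type.__name__}"
            )
    # Extra fields are ignored on purpose: additions do not break this consumer.
    return violations
```

A producer who renames total_cents gets "missing field 'total_cents'" on their own PR. A real contract tool adds the parts this sketch skips, such as sharing contracts between repositories and versioning them.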

Where the CI platform actually earns its keep

The tools the piece names are the usual, honest set. Postman/Newman for teams who live in the desktop app. REST Assured if your services are on the JVM. pytest with requests for Python teams. Supertest for Node. k6 when the thing you are measuring is latency. Keploy if you want tests generated from recorded traffic. All of them run inside whatever CI you have.

Where the pipeline platform matters is how naturally it lets you split the three stages and share artifacts between them.

  • GitHub Actions gives you matrix jobs and workflow-level triggers, so mapping PR to smoke, push to full, deploy to integration is a couple of workflow files.
  • GitLab CI/CD has native pipeline stages and environments, which fit the smoke-then-full-then-integration split cleanly, and merge-request pipelines are first-class.
  • CircleCI leans on orbs to wrap the test frameworks, which is nice when you are onboarding a new tool.
  • Jenkins does all of this if you write the pipeline yourself. If your team already runs Jenkins with confidence, do not rip it out for this.
  • Buddy models each stage as a separate pipeline with its own trigger, and the per-pipeline filesystem cache means the smoke run does not fight the full run for the same node_modules. The concrete reason to reach for it is that the three-stage shape maps to three visible pipelines rather than one YAML file with conditionals. If your team already lives in GitHub Actions, the matrix and workflow-level triggers there are the shorter path.

The rough edge

Two things this shape does not solve on its own. The smoke suite has to stay honestly small, or the 2-to-3-minute budget slips and PR feedback goes back to being a coffee break. And the integration stage is only as trustworthy as staging's fidelity to production. If staging quietly diverges, the three-hour bug moves back into production.
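One cheap guard against the second problem is to diff what staging and production actually run before trusting the integration stage. The sketch below assumes each environment can report a simple name-to-version map. That map, the service names in the usage note, and the drift function are hypothetical, and how you collect the data depends entirely on your platform.

```python
from typing import Dict, List


def drift(staging: Dict[str, str], production: Dict[str, str]) -> List[str]:
    """List every component whose version differs or exists on one side only."""
    findings = []
    for name in sorted(set(staging) | set(production)):
        s, p = staging.get(name), production.get(name)
        if s != p:
            findings.append(f"{name}: staging={s} production={p}")
    return findings
```

Calling drift({"orders-api": "2.4.1", "postgres": "15"}, {"orders-api": "2.4.1", "postgres": "16"}) reports the Postgres mismatch. Failing the integration stage on a non-empty result turns quiet divergence into a visible red build.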

What I am watching next

Whether contract testing finally becomes something a small team adopts without a dedicated platform engineer to shepherd it. The tools are there. The friction has always been the first-run setup. If one of the API testing tools folds contract generation into its default flow, this conversation gets easier on the next team that tries.

The piece cites one stat worth sending to anyone still arguing that local testing is enough: 68 percent of DevOps practitioners now run automated tests on every commit, up from 51 percent.

Source: DevOps.com (devops.com)
