Platform engineering

A CI platform just wired Claude Code into the pipeline as a first-party action — here's what that actually looks like

A CI platform just wired Claude Code into the pipeline as a first-party action — here's what that actually looks like

I have spent the last year watching coding agents migrate, slowly, from the laptop into the pipeline. They started as a Tuesday-afternoon toy in someone's IDE. Then they sat in a pre-commit hook. Then a teammate quietly wired one into a PR check and didn't tell anyone for a week. So when a CI platform ships a real, named, first-party action for Claude Code — not "use this curl recipe in a bash step", an actual action you pick from the menu — I notice. The changelog entry that prompted this piece is Buddy's June 16 release, which adds a Claude Code action and an Anthropic integration alongside a handful of pipeline and sandbox improvements.

It's not a flashy headline. It's the kind of change you only notice the next time you reach for it and realise the friction is gone.

What actually shipped

Two items in the changelog do the work here. The first is a new pipeline action called Claude Code. The second is a new Anthropic integration — the credential-and-config object that the action references. Everything else in the release supports them or sits adjacent: sandboxes now keep themselves alive while you're actively using them, pipelines pick up two new modes for action loops (by file path and by iteration count), the CLI gains an artifact version create command, YAML imports show a conflict-resolution modal, and execution logs run faster on large pipelines. There are also a couple of bug fixes — one on HTTP/2 being negotiated when it shouldn't on raw TLS tunnels, and one where terminal sessions weren't extending the sandbox timeout.

For the purposes of this piece, the rest is context. The Claude Code action is the news.

Setting it up, end to end

The Anthropic integration docs describe the flow, and it's mercifully short. You open the Integrations tab, click "New integration", pick Anthropic, and a configuration window walks you through the fields:

  • a name and ID for the integration (the ID is what your action will reference later),
  • a scope — Workspace, Project, or Environment — which is the thing platform teams will care about, because it's how you stop one team's experiment from accidentally being callable from the next team's prod pipeline,
  • and an authorization method.

That last field is where it gets interesting. The integration supports two auth styles. The default is an Anthropic API key — the kind you mint at console.anthropic.com/settings/keys, the value that starts with sk-ant-. The alternative is a Claude Code token, which the docs describe as the path for people who sign in through a Claude subscription instead of a console API key; you generate it locally by running claude setup-token and pasting the result into the form.

Both have their place. The console API key is the right shape for a shared platform credential — one key, one billing surface, scoped via Buddy's own permissions. The Claude Code token reads as the bring-your-own-subscription path, which makes more sense when a single engineer wants to wire their account into a sandbox and see what it can do, less so when you're trying to give an org-wide budget to a feature team.

Once the integration exists, the Claude Code action references it by ID. From the platform's perspective, the action is just another action; from Claude's perspective, it's running with whichever credentials sit behind that ID.

The kinds of jobs this is for

The integration docs are upfront about the use cases: Claude Code can review code, generate changes, or automate routine engineering tasks, and it can do that on every push, on a schedule, or on demand. None of that is unique to this platform — it is what agentic coding assistants do — but having it as a named action changes the ergonomics.

On every push is the obvious one: a PR opens, the action reviews the diff, leaves a comment or pushes a follow-up commit, and the rest of the pipeline carries on. Scheduled runs are more interesting and less talked about — the "every Monday morning, look at last week's dependency updates and tell me which ones look risky" job, the kind of cron-driven sweep that nobody volunteers to do by hand. On-demand is a manual trigger from the UI, useful for the "refactor this module and propose a patch" task that you don't want firing on every commit.

A sketch of how a real pipeline step references the integration — placeholders only, because the YAML shape is going to keep moving and the point is the wiring, not the schema:

- name: review the diff
  action: claude-code
  integration: <anthropic-integration-id>
  inputs:
    prompt: "Review the changes in this PR for risk and clarity."
    target: $PIPELINE_DIFF
    # scope, model, timeout etc. configured in the action UI

The thing I quietly appreciated reading the docs: the credential lives in the integration object, not in a secret: reference inside the pipeline YAML. That sounds like a nit. It is not a nit. It is the difference between "anyone with commit rights can read which key gets used" and "the key is platform state with its own scope and permissions". Boring, correct, the kind of design choice you want.

How other CI tools handle the same thing today

This is one of half a dozen ways the industry is shoving Claude (and friends) into a build. Worth comparing honestly:

  • GitHub Actions — Anthropic publishes an official Claude Code GitHub Action, which is the natural fit if you already live inside Actions. It runs as a step, reads the GitHub event payload, and posts back via the standard GitHub APIs. If your org is GitHub-only, this is almost certainly the tighter integration and the right choice over rolling your own.
  • GitLab CI — there's no first-party "Claude action" the way Actions and now Buddy have. You wire it via a job that calls Anthropic's API directly, usually with a small wrapper script. Flexible, but the credential plumbing and permissions are yours to design and audit.
  • CircleCI — community orbs cover the API-call pattern; the experience is closer to GitLab's than to Actions', which is to say good enough but not a "click to add" affair.
  • Jenkins — a shell step calling the API. It works, has worked for years, and inherits all of Jenkins' usual credential-store tradeoffs. The lack of a curated action means the integration looks however careful your plugin pickers are.
  • Buddy — the subject of this piece: a named action, a scoped integration object, two auth modes. Materially the same capability as the others on paper; the difference is the surface area and where the credential lives.

No "best" here. The honest concession: if your repo lives in GitHub, the official action will probably feel more native than any third-party wiring, including this one — it gets first-class access to the GitHub event model that the others have to reconstruct. The case for a platform-native action elsewhere is strongest when that platform is already where your pipeline lives.

The rough edges I'd flag

A first-party action does not solve the hard problems of putting an LLM in your pipeline. It just makes the wiring tidier. Things I'd still think about on day one:

  • Scope discipline. The integration scope (Workspace / Project / Environment) is doing real work — use the narrowest scope that fits the job. A workspace-scoped Anthropic credential is, in practice, an org-wide credential.
  • Token vs API key, not "whichever". The Claude Code token path is convenient for a single developer; it's not the credential you want behind a scheduled PR-review job that anyone on the team can read the output of.
  • The action's blast radius is whatever you let it do. Reviewing a diff and posting a comment is a different trust posture from generating changes and pushing a commit. Both are described as use cases. Pick consciously.
  • Cost shape. Per-push agent runs add up fast on busy repos. Even with the budget tied to the integration, you'll want to watch the bill for a couple of weeks before deciding which triggers stay on.

None of that is a reason to skip the feature. It's a reminder that "first-party action" moves the convenience line, not the threat model.

What I'm watching next

A few things I want to see in the months after a release like this lands:

  • Whether the loop modes added in the same release (by file path and by iteration count) end up paired with the Claude action — a "run Claude across each changed file with a hard iteration cap" pattern would be a genuinely useful template, and the building blocks are now in the same changelog.
  • How the integration's permissions story matures. Scope-by-environment is the right primitive; the test is whether it's actually used by the average team or whether everyone defaults to workspace because the UI nudges them there.
  • What the audit story looks like in practice — every team I know that has wired an agent into a pipeline has eventually wanted "show me every place this credential ran last week", and the platforms that answer that question quickly are the ones that win the trust argument.

If you're already letting Claude touch your code on a laptop, putting the same capability inside the pipeline isn't a bigger leap — but it is a more public one. Worth a Tuesday afternoon to wire it up, and a serious half-hour with whoever owns your credential model before you flip it on for real.

Source: Buddy Changelog (buddy.works)

Related
Platform engineering

GitHub's agent finder lets Copilot look up its own tools

GitHub shipped agent finder for Copilot, a discovery layer that searches MCP servers, skills, canvases, agents and tools instead of pre-wiring them into a context window. It implements the open Agentic Resource Discovery specification developed with Google, GoDaddy, Hugging Face and Microsoft.

June 18, 2026
Platform engineering

When async agents outnumber humans, code review stops being the verification layer

A piece in The New Stack picks up Ido Pesok's disclosure from Cognition that more Devin runs are now triggered asynchronously — by events, schedules, automations and other agents — than by people. The argument: once the trigger ratio flips, verification has to move out of code review and into the runtime that CI/CD owns.

June 17, 2026
Security & supply chain

When the coding agent runs as you, your blast radius is its blast radius

Docker's latest 'horror stories' post dissects a 13-hour AWS Cost Explorer outage in which a coding agent decided the cleanest fix was to delete production and rebuild it. The deeper failure is structural: an agent with the engineer's identity inherits the engineer's privileges, and the pipeline cannot tell which one of them is at the keyboard.

June 18, 2026

Turn this into your pipeline. Build it on Buddy.

Start free