A self-hosted Kubernetes diagnostic agent shipped by GitOps: the CNCF walkthrough
Maya Okonkwo
A CNCF Ambassador published a walkthrough on June 25 for running a self-hosted, read-only AI agent inside a Kubernetes cluster, with the CI/CD chain handled entirely by GitHub Actions and Argo CD Image Updater. The point of the piece, per the CNCF blog, is a single property: no data leaves the cluster, and no cloud AI provider sits in the loop.
The author, Maryam Tavakkoli (Lead Cloud Engineer at RELEX Solutions), frames it as a starting point for understanding cluster-aware agent patterns, not a production deployment.
What the walkthrough actually builds
The agent runs as two pods. One serves the model with Ollama on port 11434, backed by a PersistentVolumeClaim for weights. The other is a FastAPI process on port 8000 that exposes a chat UI and the agent loop. The model is Mistral 7B, running locally.
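The call from the FastAPI pod to the model pod can be sketched as below, assuming Ollama's HTTP generate endpoint on the port the walkthrough names. The Service hostname `ollama`, the model tag `mistral`, and both function names are illustrative assumptions, not taken from the walkthrough.

```python
import json
import urllib.request

# Hypothetical in-cluster Service name; only the port (11434) comes from the walkthrough.
OLLAMA_URL = "http://ollama:11434/api/generate"


def build_generate_request(prompt: str, model: str = "mistral") -> urllib.request.Request:
    """Build a non-streaming generate request for the model pod."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the model pod and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Every request in this sketch targets an in-cluster address, which is the property the walkthrough is built around.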
There are two operational modes. Ask mode sends the prompt straight to the LLM and answers from training data. Diagnose mode is the agent path: before replying, the FastAPI pod reads live cluster state, including pods, events, logs, services and deployments, and feeds that into the prompt.
The split matters because only the diagnose path needs cluster credentials at all. Ask mode never calls the Kubernetes API.
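A minimal sketch of that split, assuming the mode arrives as a plain string and that cluster state is gathered by an injected callable. The function name, the prompt layout, and the `read_cluster_state` parameter are hypothetical and not from the walkthrough.

```python
from typing import Callable


def build_prompt(mode: str, question: str, read_cluster_state: Callable[[], str]) -> str:
    """Return the prompt to send to the model for the given mode."""
    if mode == "ask":
        # Ask mode: no Kubernetes API call; the model answers from training data.
        return question
    if mode == "diagnose":
        # Diagnose mode: read live state first, then put it in front of the question.
        state = read_cluster_state()
        return f"Cluster state:\n{state}\n\nQuestion: {question}"
    raise ValueError(f"unknown mode: {mode!r}")
```

Because the reader is only invoked on the diagnose branch, ask mode never touches the credentialed code path.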
Why it is read-only by construction
The agent's ServiceAccount is bound to a ClusterRole whose verbs are get and list, nothing else. The resources are the ones a human SRE reaches for first: pods and pod logs, events, services, configmaps, namespaces, deployments, replicasets, statefulsets, daemonsets.
That is the design choice doing the work. Whatever the model decides to do, the API server only lets it observe. A diagnose answer suggesting a kubectl delete cannot become one, because the credential the pod carries does not allow it.
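The effect can be illustrated with a deliberately simplified model of RBAC rule matching. This is not how the API server is implemented: it ignores wildcards, resourceNames, and non-resource URLs. The split into a core-group rule and an `apps`-group rule is an assumption about how the ClusterRole is laid out.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Rule:
    api_groups: Tuple[str, ...]
    resources: Tuple[str, ...]
    verbs: Tuple[str, ...]


# Assumed layout of the read-only ClusterRole described above.
READ_ONLY_RULES = (
    Rule(("",), ("pods", "pods/log", "events", "services", "configmaps", "namespaces"),
         ("get", "list")),
    Rule(("apps",), ("deployments", "replicasets", "statefulsets", "daemonsets"),
         ("get", "list")),
)


def is_allowed(rules: Iterable[Rule], api_group: str, resource: str, verb: str) -> bool:
    """A request is allowed only if some rule names its group, resource and verb."""
    return any(
        api_group in r.api_groups and resource in r.resources and verb in r.verbs
        for r in rules
    )


print(is_allowed(READ_ONLY_RULES, "", "pods", "get"))     # read is permitted
print(is_allowed(READ_ONLY_RULES, "", "pods", "delete"))  # write is refused
```

RBAC is additive, so a verb that appears in no rule is simply never granted, regardless of what the model outputs.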
How the CI/CD chain ships an update
GitHub Actions builds the FastAPI container on every commit, producing a multi-architecture image (linux/amd64 and linux/arm64) tagged with the 7-character commit SHA. The image lands in Docker Hub.
Argo CD Image Updater polls the registry on a 2-minute interval, matches the tag via regex, and commits the new image reference into kustomization.yaml. Argo CD then reconciles the cluster against that manifest. Git stays the source of truth and the rollout is a pull from the cluster, not a push from the CI runner.
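The tag-matching step can be sketched as follows. This illustrates the idea and is not Image Updater's code: the regex is an assumed pattern for a 7-character lowercase hex SHA, and choosing the winner by push time is also an assumption, needed because commit SHAs carry no ordering of their own.

```python
import re
from typing import Dict, Optional

# Assumed pattern: exactly seven lowercase hex characters.
SHA_TAG = re.compile(r"[0-9a-f]{7}")


def pick_newest_sha_tag(tags: Dict[str, float]) -> Optional[str]:
    """Given tag -> push time (epoch seconds), return the newest SHA-shaped tag."""
    candidates = {tag: ts for tag, ts in tags.items() if SHA_TAG.fullmatch(tag)}
    if not candidates:
        return None
    # Ties aside, the tag with the largest push time wins.
    return max(candidates, key=candidates.get)
```

Filtering first keeps floating tags such as `latest` or release tags from ever being written into the manifest.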
For a CI/CD practitioner, the wiring is a more reusable artifact than the agent. The same chain works for any workload where the build step runs on a hosted runner but that runner must never hold a credential for the cluster.
A shape-only sketch of the bits the walkthrough wires together:
# Image reference Argo CD Image Updater rewrites on each new build
image: <registry>/<image-name>:<7char-sha>
# RBAC the agent runs under (verbs, not resources, are the safety net)
rules:
  - apiGroups: [""]
    resources: [pods, pods/log, events, services, configmaps, namespaces]
    verbs: [get, list]
Where the walkthrough gets thin
The author concedes that a 7-billion-parameter model is not as capable as a cloud frontier model. Diagnose mode is reasoning over real cluster state, but the reasoner is small. The trade is explicit: capability for network locality.
The 2-minute Image Updater polling interval also adds rollout latency: a new image can sit in the registry for up to two minutes before the updater notices it, and Argo CD still has to reconcile after that. For a developer-facing diagnostic agent that is fine. For a workload where a bad build needs to be off production in seconds, polling a registry from inside the cluster is not the right control loop, and the walkthrough does not pretend otherwise.
And the read-only RBAC, while sound, only covers the verbs. A model that can read every pod log and every configmap in a namespace can still surface secrets in plain text if any are stored there. The privacy claim is about where the data is processed, not about what the data contains.
How other teams approach the same shape
The general pattern predates the AI framing. Argo CD Image Updater has been a common way to keep GitOps repos and image registries in sync without giving the CI pipeline write access to the cluster. Flux's image automation controllers solve the same problem with a different config surface. Teams that pin all images by digest and route updates through Renovate or Dependabot push the same artifact bumps through normal pull requests instead of a controller.
What the CNCF walkthrough adds is a worked example of pointing that machinery at an in-cluster workload that itself needs cluster-state read access at runtime. The shape is not new. The justification is.
Source: CNCF Blog (cncf.io)