Containers
Container checkpoint
The rollout checkpoint still had one unrealistic shortcut. Mandelbrot was running as a Kubernetes workload, but the application code was not delivered as an application image. The cluster had the shape of a platform, while the app release path still behaved like a development shortcut.
This checkpoint fixes that. Mandelbrot is now a container image, pushed to the cloud provider registry for the cluster that will run it:
AWS -> ECR
GCP -> Artifact Registry
Azure -> Azure Container Registry
The second change is more important than the Dockerfile. Infrastructure deployment and application deployment are now separate. A change to the Mandelbrot UI or server should not run pulumi up for EKS, GKE, and AKS. The cluster is the platform. The app is a workload running on that platform.
The final shape is:
infra/platform change -> Pulumi Deploy -> clusters, registries, Argo CD, platform apps
app change -> Mandelbrot Deploy -> image build, image push, auto-merged GitOps PR
rollout operation -> Mandelbrot Rollout -> sync, status, promote, abort, undo
That is the operating model I wanted this checkpoint to prove.
The image
The Mandelbrot image is intentionally small. It starts from Node 22 Alpine, copies only the runtime files, runs as the node user, and exposes port 8080:
FROM node:22-alpine
ENV NODE_ENV=production \
    PORT=8080
WORKDIR /app
COPY base/app/index.html base/app/server.js ./
RUN chown -R node:node /app
USER node
EXPOSE 8080
CMD ["node", "/app/server.js"]
The .dockerignore is stricter than the Dockerfile:
*
!Dockerfile
!base/
!base/app/
!base/app/index.html
!base/app/server.js
That keeps the build context from accidentally carrying Pulumi files, kubeconfigs, workflow scripts, or local output into the image build. For this app there is no package install step and no build artifact. The app is a single Node server plus one HTML file.
The Kubernetes manifest now uses a normal image field:
containers:
  - name: mandelbrot
    image: mandelbrot:dev
    imagePullPolicy: Always
The base stays generic. The cloud overlays own the real image reference.
Provider registries
Each cloud stack creates the registry close to the cluster:
AWS ECR repository: trinity-dev-aws-mandelbrot
GCP Artifact Registry repository: trinity-dev-gcp-mandelbrot
Azure ACR registry: trinitydevazureacr
The build script resolves the correct image repository from the selected cloud and environment:
aws <account>.dkr.ecr.us-east-1.amazonaws.com/trinity-dev-aws-mandelbrot
gcp us-central1-docker.pkg.dev/trinity-k8s/trinity-dev-gcp-mandelbrot/mandelbrot
azure trinitydevazureacr.azurecr.io/mandelbrot
It then logs in with the provider CLI, builds the Docker image with the requested tag, and pushes it:
npm run push:mandelbrot-image -- \
  --cloud aws \
  --tag sha-36f1855b6a8139ea26be13dc6616477f30d3b4e1
Pull requests use pr-<number>-<sha> tags. Main app releases use sha-<sha> tags.
The PR tag is useful because it proves the container build and registry push before merge. The SHA tag is useful because it is never reused for different content and maps back to a Git commit without needing a separate release numbering system.
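The resolution and tagging rules above are small enough to sketch. This is a minimal illustration, not the contents of the real build script: the function names, the `awsAccountId` option, and the `env` parameter are hypothetical, and the way `env` is substituted into the names is only an extrapolation from the dev names listed above.

```javascript
// Hypothetical helpers; the actual build script is not shown in this post.
// Assumption: the environment name is substituted where "dev" appears above.
function resolveRepository(cloud, { env = "dev", awsAccountId } = {}) {
  switch (cloud) {
    case "aws":
      if (!awsAccountId) throw new Error("aws needs an account id");
      return `${awsAccountId}.dkr.ecr.us-east-1.amazonaws.com/trinity-${env}-aws-mandelbrot`;
    case "gcp":
      return `us-central1-docker.pkg.dev/trinity-k8s/trinity-${env}-gcp-mandelbrot/mandelbrot`;
    case "azure":
      return `trinity${env}azureacr.azurecr.io/mandelbrot`;
    default:
      throw new Error(`unknown cloud: ${cloud}`);
  }
}

// Pull requests get pr-<number>-<sha>; main releases get sha-<sha>.
function imageTag({ sha, prNumber }) {
  if (!/^[0-9a-f]{7,40}$/.test(sha)) throw new Error(`not a commit SHA: ${sha}`);
  return prNumber === undefined ? `sha-${sha}` : `pr-${prNumber}-${sha}`;
}
```

Keeping both rules in pure functions means the login, build, and push steps only ever see a fully resolved `repository:tag` string.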
Pull permissions
The provider-specific pull model was the first place where the clouds diverged.
AWS and GCP can use cloud IAM on the node side:
- AWS nodes pull from ECR through the EKS node role.
- GCP nodes pull from Artifact Registry through the node service account binding.
Azure was the only awkward one. The production-shaped path is not a Kubernetes pull secret. It is Azure RBAC: give the AKS kubelet identity AcrPull on the registry and keep ACR admin credentials disabled.
The first implementation tried to do that, but the deployment identity only had Contributor. Contributor can create a lot of resources, but it cannot assign roles. Azure rejected the deployment with:
does not have authorization to perform action
'Microsoft.Authorization/roleAssignments/write'
That failure was useful because it separated two permissions that are easy to blur together:
- creating the registry
- granting another identity permission to pull from it
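The reason Contributor can do the first but not the second is that its role definition allows every action and then carves out authorization writes through `notActions`. The sketch below models that evaluation in a simplified form. The role definition is abbreviated to the three exclusions that matter here, and real Azure RBAC also considers scope, deny assignments, and data actions, none of which are modelled.

```javascript
// Abbreviated model of the built-in Contributor role (not the full definition).
const contributor = {
  actions: ["*"],
  notActions: [
    "Microsoft.Authorization/*/Delete",
    "Microsoft.Authorization/*/Write",
    "Microsoft.Authorization/elevateAccess/Action",
  ],
};

// Wildcard match: "*" matches any run of characters, comparison ignores case.
function matches(pattern, action) {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i").test(action);
}

// An action is permitted when it matches actions and no notActions entry.
function isAllowed(role, action) {
  const allowed = role.actions.some((p) => matches(p, action));
  const excluded = role.notActions.some((p) => matches(p, action));
  return allowed && !excluded;
}
```

With this model, `isAllowed(contributor, "Microsoft.ContainerRegistry/registries/write")` is true while `isAllowed(contributor, "Microsoft.Authorization/roleAssignments/write")` is false, which is the same split the failed deployment exposed.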
The stricter setup gives the GitHub Azure identity Role Based Access Control Administrator in the bootstrap stack. The cluster stack can then assign AcrPull to the AKS kubelet identity:
new authorization.RoleAssignment(`${clusterName}-mandelbrot-acr-pull`, {
  scope: containerRegistry.id,
  roleDefinitionId: acrPullRoleDefinitionId,
  principalId: kubeletIdentityObjectId,
  principalType: authorization.PrincipalType.ServicePrincipal,
});
The ACR registry now has adminUserEnabled: false, and the Azure overlay no longer needs imagePullSecrets. Image pushes still use the Azure CLI identity in GitHub Actions. Runtime pulls use the kubelet identity.
GitOps image ownership
There was one design decision to make after the image existed:
Where should the desired image tag live?
At first, Pulumi patched the Mandelbrot Argo CD application with spec.source.kustomize.images. That worked, but it coupled app release to cluster deployment. It also made the app image tag an infrastructure concern.
The final model moves the image tag into GitOps:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: mandelbrot
    newName: us-central1-docker.pkg.dev/trinity-k8s/trinity-dev-gcp-mandelbrot/mandelbrot
    newTag: sha-7402835a5f8e272372444f14417234b60600f04b
Each cloud overlay has its own images entry. Argo CD reads the overlay from Git, renders the cloud-specific image reference, and applies the Rollout.
That means app releases create Git changes. The release record is visible:
image pushed -> sha-...
overlay changed -> sha-...
Argo syncs -> sha-...
Rollback is also simple: revert the image tag commit or merge a new one pointing at an older SHA.
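Because a release or a rollback is just a change to one `newTag` line, the edit can be sketched as a small text transform. This is an illustration under assumptions, not the repository's update script: `setImageTag` is a hypothetical name, and it only handles the simple layout shown above, where `name` is the first key of each `images` entry. A YAML parser or `kustomize edit set image` would be the sturdier choice.

```javascript
// Rewrite the newTag of one entry in a kustomization's images list.
// Assumes each entry starts with "- name: <image>" followed by indented keys.
function setImageTag(kustomization, imageName, newTag) {
  let inTarget = false;
  let updated = false;
  const out = kustomization.split("\n").map((line) => {
    const entry = line.match(/^\s*-\s*name:\s*(\S+)\s*$/);
    if (entry) {
      inTarget = entry[1] === imageName; // start of an images entry
      return line;
    }
    if (/^\S/.test(line)) {
      inTarget = false; // a new top-level key ends the entry
      return line;
    }
    if (inTarget && /^\s*newTag:/.test(line)) {
      updated = true;
      return line.replace(/newTag:.*$/, () => `newTag: ${newTag}`);
    }
    return line;
  });
  if (!updated) throw new Error(`no newTag found for image ${imageName}`);
  return out.join("\n");
}
```

Throwing when nothing was updated matters more than the rewrite itself: a release that silently changes no overlay would look successful and deploy nothing.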
The app deployment workflow
The new Mandelbrot Deploy workflow runs only for app delivery concerns:
on:
  push:
    branches:
      - main
    paths:
      - ".github/workflows/mandelbrot-deploy.yml"
      - "apps/mandelbrot/Dockerfile"
      - "apps/mandelbrot/.dockerignore"
      - "apps/mandelbrot/base/**"
      - "scripts/build-push-mandelbrot-image.mjs"
      - "scripts/update-mandelbrot-gitops-images.mjs"
      - "package.json"
      - "package-lock.json"
  workflow_dispatch:
It uses a three-cloud matrix:
build-and-push aws
build-and-push gcp
build-and-push azure
Each matrix job authenticates to one cloud with GitHub OIDC, builds the image, pushes it to that cloud registry, and uploads a small metadata file:
{
  "cloud": "gcp",
  "repository": "us-central1-docker.pkg.dev/trinity-k8s/trinity-dev-gcp-mandelbrot/mandelbrot",
  "tags": ["sha-36f1855b6a8139ea26be13dc6616477f30d3b4e1"],
  "images": [
    "us-central1-docker.pkg.dev/trinity-k8s/trinity-dev-gcp-mandelbrot/mandelbrot:sha-36f1855b6a8139ea26be13dc6616477f30d3b4e1"
  ]
}
The update-gitops job downloads those metadata files and updates the three Kustomize overlays. Then it renders the overlays with kubectl kustomize before it tries to publish the release change.
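The step from metadata files to overlay changes could look roughly like the sketch below. It is not the real update script. The function names are hypothetical, and the `apps/mandelbrot/overlays/<cloud>/kustomization.yaml` layout is an assumption inferred from the overlay path and the `../../base` resource reference.

```javascript
// Split "registry/repo:tag" at the tag separator, ignoring a registry port.
function splitImageRef(ref) {
  const idx = ref.lastIndexOf(":");
  if (idx === -1 || ref.indexOf("/", idx) !== -1) {
    throw new Error(`image reference has no tag: ${ref}`);
  }
  return { newName: ref.slice(0, idx), newTag: ref.slice(idx + 1) };
}

// Turn the per-cloud metadata objects into one planned images entry per overlay.
function planOverlayUpdates(metadataFiles) {
  const plan = {};
  for (const meta of metadataFiles) {
    const { newName, newTag } = splitImageRef(meta.images[0]);
    if (newName !== meta.repository) {
      throw new Error(`${meta.cloud}: image does not belong to ${meta.repository}`);
    }
    // Assumed overlay layout; adjust to the real directory structure.
    const path = `apps/mandelbrot/overlays/${meta.cloud}/kustomization.yaml`;
    plan[path] = { name: "mandelbrot", newName, newTag };
  }
  // All clouds should release the same commit.
  const tags = new Set(Object.values(plan).map((entry) => entry.newTag));
  if (tags.size > 1) throw new Error("clouds disagree on the release tag");
  return plan;
}
```

Checking that every cloud reports the same tag before touching any overlay keeps a partially failed matrix from producing a release where the three clusters run different commits.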
At first the workflow pushed the tag commit straight to main. The repository rejected that:
Repository rule violations found for refs/heads/main.
Changes must be made through a pull request.
That is the correct repository rule. The workflow now pushes to a release branch:
mandelbrot/deploy-<sha>
and opens a pull request:
gh pr create \
  --base "${GITHUB_REF_NAME}" \
  --head "${RELEASE_BRANCH}" \
  --title "${RELEASE_TITLE}"
The workflow then enables auto-merge for that PR:
gh pr merge "${pr_number}" \
  --squash \
  --auto \
  --delete-branch
That keeps the repository rule intact: main still changes through a pull request. The difference is that generated image-tag PRs do not need routine human handling. If branch protection requires checks or approval, GitHub waits for those requirements before merging. The repository also has to allow auto-merge for this to work.
The workflow trigger deliberately watches the app source, Dockerfile, and release scripts, but not apps/mandelbrot/overlays/**. The generated image-tag PR changes the overlays. Excluding that path prevents the auto-merged release PR from triggering another image build for the same application code.
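The loop-prevention argument can be checked with a simplified model of the path filter. The matcher below only understands exact paths and a trailing `/**`, which covers the patterns this workflow uses. It is not GitHub's actual glob implementation, and the overlay file name in the usage note is assumed.

```javascript
// The paths list from the Mandelbrot Deploy trigger.
const triggerPaths = [
  ".github/workflows/mandelbrot-deploy.yml",
  "apps/mandelbrot/Dockerfile",
  "apps/mandelbrot/.dockerignore",
  "apps/mandelbrot/base/**",
  "scripts/build-push-mandelbrot-image.mjs",
  "scripts/update-mandelbrot-gitops-images.mjs",
  "package.json",
  "package-lock.json",
];

// Simplified matching: "dir/**" matches anything under dir/, otherwise exact.
function matchesPattern(pattern, path) {
  if (pattern.endsWith("/**")) return path.startsWith(pattern.slice(0, -2));
  return pattern === path;
}

// A push triggers the workflow when at least one changed file matches.
function triggersDeploy(changedPaths) {
  return changedPaths.some((path) =>
    triggerPaths.some((pattern) => matchesPattern(pattern, path))
  );
}
```

A commit that touches only `apps/mandelbrot/overlays/gcp/kustomization.yaml` gives `triggersDeploy(...) === false`, while any change under `apps/mandelbrot/base/` gives `true`, so the generated release PR cannot start another build on its own.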
There was one more small rerun bug. The first version used git push --force-with-lease, but on a rerun the fresh checkout had no remote-tracking ref for the release branch, so Git had nothing to check the lease against and rejected the push as stale. The release branch is owned by this workflow and includes the commit SHA in its name, so the final version uses a plain force push to that branch:
git push --force origin "HEAD:${RELEASE_BRANCH}"
It still does not force-push main.
Infrastructure deployment after the split
Pulumi Deploy now ignores routine app changes. It is still responsible for:
- clusters
- registries
- cloud pull permissions
- Argo CD
- platform applications
- traffic resources
- destroy ordering
It no longer builds the Mandelbrot image and no longer injects the Mandelbrot image tag into the Argo CD application.
This also removed the Pulumi config path for MANDELBROT_IMAGE_TAG. That value belongs to the app release workflow now, not the cluster stack.
That split is the important checkpoint. Infrastructure can be slow, deliberate, and protected. Application release can be smaller and repeatable without touching the cluster control plane.
Argo CD ownership lesson
One bug exposed a useful GitOps boundary. I briefly let Pulumi create or patch the child Mandelbrot Argo CD application directly. The root Argo CD application owns the child application list from Git. If the child app is not in the root tree, Argo can prune or fight the Pulumi-managed version.
The final shape restores the static child applications under:
platform/argocd/applications/aws/mandelbrot.yaml
platform/argocd/applications/gcp/mandelbrot.yaml
platform/argocd/applications/azure/mandelbrot.yaml
Pulumi bootstraps Argo CD and the root application. The root application owns the child applications. The child Mandelbrot applications point at the cloud overlays. The overlays own the image reference.
That is a cleaner ownership chain:
Pulumi -> Argo root
Argo root -> child Applications
Mandelbrot Application -> cloud overlay
cloud overlay -> image tag
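One way to read that chain is that every resource has exactly one owner, so walking upward from any resource never forks. The sketch below only models the four edges listed above. The names are labels taken from this post rather than real resource identifiers, and the Mandelbrot Application is treated as one of the root's child Applications.

```javascript
// Each resource maps to the single thing that owns it.
const ownedBy = new Map([
  ["Argo root", "Pulumi"],
  ["Mandelbrot Application", "Argo root"],
  ["cloud overlay", "Mandelbrot Application"],
  ["image tag", "cloud overlay"],
]);

// Walk from a resource up to the top-level owner, guarding against cycles.
function ownershipChain(resource) {
  const chain = [resource];
  const seen = new Set(chain);
  let current = resource;
  while (ownedBy.has(current)) {
    current = ownedBy.get(current);
    if (seen.has(current)) throw new Error(`ownership cycle at ${current}`);
    seen.add(current);
    chain.push(current);
  }
  return chain;
}
```

The earlier bug corresponds to giving "Mandelbrot Application" a second owner. A single-valued map cannot express that, which is the property the final shape restores.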
Verification
The final verification was intentionally boring. All Argo CD applications returned to Synced and Healthy across the three clouds:
== aws ==
trinity-dev-aws-root Synced Healthy
trinity-mandelbrot-aws Synced Healthy
trinity-observability-aws Synced Healthy
trinity-policies-aws Synced Healthy
trinity-policy-engine-aws Synced Healthy
trinity-rollouts-aws Synced Healthy
trinity-secrets-aws Synced Healthy
trinity-secrets-demo-aws Synced Healthy
== gcp ==
trinity-dev-gcp-root Synced Healthy
trinity-mandelbrot-gcp Synced Healthy
trinity-observability-gcp Synced Healthy
trinity-policies-gcp Synced Healthy
trinity-policy-engine-gcp Synced Healthy
trinity-rollouts-gcp Synced Healthy
trinity-secrets-gcp Synced Healthy
trinity-secrets-demo-gcp Synced Healthy
== azure ==
trinity-dev-azure-root Synced Healthy
trinity-mandelbrot-azure Synced Healthy
trinity-observability-azure Synced Healthy
trinity-policies-azure Synced Healthy
trinity-policy-engine-azure Synced Healthy
trinity-rollouts-azure Synced Healthy
trinity-secrets-azure Synced Healthy
trinity-secrets-demo-azure Synced Healthy
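A listing in that format is easy to check mechanically. The sketch below assumes the exact layout shown above (an `== cloud ==` header followed by whitespace-separated name, sync, and health columns). `findUnhealthy` is a hypothetical helper, not part of the repository.

```javascript
// Return every application that is not both Synced and Healthy.
function findUnhealthy(listing) {
  const problems = [];
  let cloud = null;
  for (const raw of listing.split("\n")) {
    const line = raw.trim();
    if (!line) continue;
    const header = line.match(/^==\s*(\S+)\s*==$/);
    if (header) {
      cloud = header[1]; // following rows belong to this cloud
      continue;
    }
    const [name, sync, health] = line.split(/\s+/);
    if (sync !== "Synced" || health !== "Healthy") {
      problems.push({ cloud, name, sync, health });
    }
  }
  return problems;
}
```

An empty result is the "boring" outcome. Anything else names the cloud and application to look at first.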
For the workload itself, the useful checks are:
for cloud in aws gcp azure; do
  KUBECONFIG=./kubeconfig.${cloud}.yaml \
    kubectl -n mandelbrot get rollout mandelbrot
  KUBECONFIG=./kubeconfig.${cloud}.yaml \
    kubectl -n mandelbrot get rollout mandelbrot \
    -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
done
The image tag should match the GitOps release PR. If the rollout pauses at the canary step, the existing Mandelbrot Rollout workflow promotes it.
What changed
This checkpoint changed the platform from "Kubernetes can run the app" to "the platform can release the app."
The work added:
- a real Docker image for Mandelbrot
- provider registry creation and image push support
- image pull support in EKS, GKE, and AKS
- PR image validation with pr-<number>-<sha> tags
- main release images with sha-<sha> tags
- GitOps-owned image references in the Kustomize overlays
- a separate app deploy workflow
- an auto-merged PR-based release path compatible with branch protection
The result is a more realistic boundary:
Pulumi changes the platform.
GitOps changes the app release.
Argo CD applies the declared release.
Argo Rollouts controls promotion.
That is the smallest containerisation step that still behaves like a real platform.