Kubernetes Deployment

Deploy MindRoom on Kubernetes for production multi-tenant deployments.

Architecture

MindRoom uses three Helm charts:

Instance Chart (cluster/k8s/instance/) - Individual MindRoom runtime with bundled dashboard/API plus Matrix/Synapse
Platform Chart (cluster/k8s/platform/) - SaaS control plane (API, frontend, provisioner)
Runtime Chart (cluster/k8s/runtime/) - MindRoom runtime only, for clusters that provide Matrix, storage, secrets, ingress, and platform services externally

Prerequisites

Kubernetes cluster (tested with k3s via kube-hetzner)
kubectl and helm installed
NGINX Ingress Controller
cert-manager (for TLS certificates)
Hetzner Cloud Controller Manager and Hetzner CSI Driver when using hcloud-volumes

Hetzner Scaling Baseline

Start with one K3s server node and keep tenant storage on Hetzner Cloud Volumes. This keeps the current bill low while making later worker nodes possible. Use hcloud-volumes for provisioned instance PVCs and avoid local-path for tenant data that must survive node replacement. The platform frontend and backend are stateless and can use HPA once metrics-server is installed. Tenant MindRoom and Synapse releases stay single-replica stateful workloads; scale capacity by placing more tenants across more nodes.

Production platform values should make the storage class explicit:

provisioner:
  instanceStorageClassName: hcloud-volumes

autoscaling:
  enabled: false

Enable autoscaling.enabled only after the cluster has metrics-server and enough nodes or headroom to schedule the extra replicas.

Instance Deployment

Via Provisioner API (Recommended)

export KUBECONFIG=./cluster/terraform/terraform-k8s/mindroom-k8s_kubeconfig.yaml

# Provision, check status, view logs
./cluster/scripts/mindroom-cli.sh provision 1
./cluster/scripts/mindroom-cli.sh status
./cluster/scripts/mindroom-cli.sh logs 1

Direct Helm Installation

For debugging only:

helm upgrade --install instance-1 ./cluster/k8s/instance \
  --namespace mindroom-instances \
  --create-namespace \
  --set customer=1 \
  --set accountId="your-account-uuid" \
  --set baseDomain=mindroom.chat \
  --set anthropic_key="your-key" \
  --set openrouter_key="your-key" \
  --set supabaseUrl="https://your-project.supabase.co" \
  --set supabaseAnonKey="your-anon-key" \
  --set supabaseServiceKey="your-service-key"

Only enable trusted upstream auth when the instance is behind a verified access layer that strips client-supplied copies of those headers and injects authenticated values itself:

helm upgrade --install instance-1 ./cluster/k8s/instance \
  --namespace mindroom-instances \
  --reset-then-reuse-values \
  --set-string trustedUpstreamAuth.enabled=true \
  --set trustedUpstreamAuth.userIdHeader=X-MindRoom-User-Id \
  --set trustedUpstreamAuth.emailHeader=X-MindRoom-User-Email \
  --set trustedUpstreamAuth.matrixUserIdHeader=X-MindRoom-Matrix-User-Id \
  --set trustedUpstreamAuth.emailToMatrixUserIdTemplate='@{localpart}:example.org'

When using the provisioner, configure the platform chart with provisioner.trustedUpstreamAuth.enabled="true" and the matching provisioner.trustedUpstreamAuth.*Header values. If your access layer cannot supply a Matrix ID header, configure provisioner.trustedUpstreamAuth.emailToMatrixUserIdTemplate with the same template. The email-to-Matrix template must contain exactly one {localpart} placeholder and requires the matching emailHeader value in both the instance and platform chart configuration.

Runtime-Only Deployment

Use the runtime chart when you already operate the surrounding platform and only want Kubernetes to run the MindRoom runtime.

The chart intentionally does not create Matrix, ingress, a model gateway, or platform services.

helm upgrade --install mindroom-runtime ./cluster/k8s/runtime \
  --namespace mindroom \
  --create-namespace \
  -f runtime-values.yaml

Typical production values point at existing resources:

config:
  create: false
  existingConfigMap: mindroom-config
  key: config.yaml

storage:
  create: false
  existingClaim: mindroom-data

matrix:
  homeserverUrl: http://matrix.example.svc.cluster.local:8008
  serverName: example.com
  registrationToken:
    existingSecret: mindroom-secrets
    key: MATRIX_REGISTRATION_TOKEN

env:
  envFrom:
    - secretRef:
        name: mindroom-secrets

workers:
  backend: kubernetes
  sandbox:
    proxyToken:
      existingSecret: mindroom-sandbox-proxy
      key: MINDROOM_SANDBOX_PROXY_TOKEN

See cluster/k8s/runtime/README.md and cluster/k8s/runtime/values.yaml for the full values surface.

Worker Backends

The instance and runtime charts support two worker backend modes for worker-routed tools such as shell, file, and python.

The dedicated-worker provisioning flow is implemented today.

Both modes store agent data in the same per-agent directory structure.

Helm value	Behavior	Best for
`workerBackend: static_runner`	Runs one shared sandbox-runner sidecar inside the main MindRoom pod	Simpler deployments
`workerBackend: kubernetes`	Creates dedicated worker Deployments and Services on demand	Stronger runtime isolation per agent (filesystem isolation depends on `worker_scope`)

Shared Sidecar Mode

workerBackend: static_runner is the default. The primary runtime talks to a shared sidecar over localhost. This keeps the deployment simple, but all proxied tool calls share the same runner process. The runner reads and writes the same agent storage directories as the main process. When encrypted credential storage is enabled in Helm, configure the credential encryption key through a Secret-backed chart value so the primary runtime and static runner sidecar receive the same key.

Dedicated Worker Mode

workerBackend: kubernetes enables the built-in Kubernetes worker backend. The primary runtime creates worker Deployments and Services on demand and routes tool calls to the matching worker. Each worker pod runs the sandbox-runner app and accesses the same agent storage directory as every other runtime for that agent. Worker-local files (caches, virtualenvs, metadata) are kept separate per worker. When a worker is idle, its Deployment scales to zero, but agent data and worker caches are preserved. The runtime chart stores derived worker tokens and optional credential-encryption keys as per-worker entries in one chart-created worker-auth Secret when workers run in the release namespace. If workers.kubernetes.namespace is set to a separate worker namespace, the runtime chart can instead manage per-worker auth Secrets in that namespace. The hosted instance chart stores derived worker tokens and optional credential-encryption keys as per-worker entries in a pre-created tenant auth Secret. The hosted instance worker-manager Role does not grant broad Secret API access in the shared mindroom-instances namespace.

Warning

Filesystem isolation depends on worker_scope. With shared, user_agent, or unscoped execution, each worker can only see its own agent's storage directory — this is the strongest isolation available. With user, the worker can see all agents' storage because it shares one runtime across multiple agents for a single user. Use user_agent for per-agent filesystem isolation.

Typical Helm values look like:

workerBackend: kubernetes
workerCleanupIntervalSeconds: 30
storageAccessMode: ReadWriteMany
controlPlaneNodeName: ""
kubernetesWorkerImage: ""
kubernetesWorkerImagePullPolicy: ""
kubernetesWorkerServiceAccountName: ""
kubernetesWorkerNamePrefix: "mindroom-worker"
kubernetesWorkerStorageSubpathPrefix: "workers"
kubernetesWorkerPort: 8766
kubernetesWorkerReadyTimeoutSeconds: 60
kubernetesWorkerIdleTimeoutSeconds: 1800
sandbox_proxy_token: "replace-me"

The runtime chart exposes the same concepts under the nested workers.* values.

Important behavior and constraints:

kubernetesWorkerImage and kubernetesWorkerImagePullPolicy default to the main MindRoom image settings when left empty.
workerCleanupIntervalSeconds controls how often the primary runtime runs idle-worker cleanup.
kubernetesWorkerIdleTimeoutSeconds controls when a worker is considered idle and eligible to scale down.
kubernetesWorkerReadyTimeoutSeconds controls how long the primary runtime waits for a worker Deployment to become ready.
kubernetesWorkerPort is the internal Service and container port used by dedicated workers.
Dedicated workers need access to the shared instance PVC so they can reach agent storage directories.
For shared, user_agent, and unscoped execution, mounts are narrowed to just the target agent's directory plus the worker's scratch space.
Shared credentials are copied into each dedicated worker as needed instead of exposing the whole shared credentials directory inside agent-isolated pods.
Dedicated workers start with no shared credentials by default.
Only services listed in defaults.worker_grantable_credentials are available inside a dedicated worker.
google_vertex_adc is intentionally unsupported for dedicated workers because workers do not receive ADC files or GOOGLE_APPLICATION_CREDENTIALS; keep Vertex ADC usage in the primary runtime.
workers.kubernetes.extraEnv and MINDROOM_KUBERNETES_WORKER_ENV_JSON are filtered before reaching worker pods or startup manifests: generated worker env, runtime control env, Kubernetes backend config env, and vendor telemetry env are dropped.
The one sandbox-control value intentionally allowed through Kubernetes worker extra env is MINDROOM_SANDBOX_RUNNER_SUBPROCESS_TIMEOUT_SECONDS, so operators can tune runner subprocess timeouts.
Dedicated worker runtime env stays deny-by-default for provider and arbitrary .env values, while basic runtime plumbing such as PATH, VIRTUAL_ENV, and linker vars is set separately.
This matches the broader sandbox-proxy contract for python and shell: proxied execution is intentionally stricter than direct local execution and does not inherit ordinary runtime .env or provider env by default.
For agent-editable per-workspace env (extra PATH entries, package indexes, npm cache dirs, etc.), use the request-time .mindroom/worker-env.sh overlay documented in Sandbox Proxy Isolation. The overlay is sourced inside the running worker per request, so it does not change the worker Deployment, the startup manifest, the pod-template hash, or any Helm value, and does not require a worker restart when edited.
MindRoom-owned workspace identity, cache, and virtualenv env names remain controlled by the worker runtime and cannot be redirected by .mindroom/worker-env.sh: HOME, MINDROOM_AGENT_WORKSPACE, XDG_CONFIG_HOME, XDG_DATA_HOME, XDG_STATE_HOME, XDG_CACHE_HOME, PIP_CACHE_DIR, UV_CACHE_DIR, PYTHONPYCACHEPREFIX, and VIRTUAL_ENV.
Worker-local caches may still live under kubernetesWorkerStorageSubpathPrefix/<worker-dir>/.

Storage Requirements

Dedicated workers need access to the same PVC as the primary runtime. Set storageAccessMode: ReadWriteMany so multiple workers can access agent storage concurrently. If your storage class only supports ReadWriteOnce, set controlPlaneNodeName so the control plane and dedicated workers stay on the same node. The chart enforces this constraint during template rendering. For the hosted SaaS instance chart, keep the default static_runner backend on single-node clusters. Switching hosted instances to dedicated Kubernetes workers on a multi-node cluster requires either a real ReadWriteMany storage class or explicit node pinning.

RBAC And Network Policy

When workerBackend: kubernetes is enabled, the chart creates:

A worker-manager ServiceAccount for the primary runtime.
A Role and RoleBinding that allow managing worker Deployments and Services in the instance namespace.
In the runtime chart's default same-namespace mode, a chart-created worker-auth Secret plus narrow get and patch access to only that Secret.
In the runtime chart's explicit separate worker namespace mode, Secret CRUD for per-worker auth Secrets in that worker namespace.
In the hosted instance chart, a pre-created tenant worker-auth Secret plus narrow get and patch access to only that Secret.
NetworkPolicy rules that allow the primary runtime to reach the internal worker port while denying worker-to-worker runner ingress.

Operations

The authenticated dashboard API exposes /api/workers to list active or idle workers and /api/workers/cleanup to trigger cleanup manually. Dedicated workers are internal-only cluster Services and are authenticated with per-worker runner tokens derived from the primary runtime's sandbox_proxy_token. See Sandbox Proxy Isolation for the execution model, credential leases, and non-Kubernetes deployment modes.

Secrets Management

API keys are mounted as files at /etc/secrets/ (not environment variables). MindRoom reads paths from *_API_KEY_FILE environment variables:

env:
  - name: ANTHROPIC_API_KEY_FILE
    value: "/etc/secrets/anthropic_key"
  - name: OPENROUTER_API_KEY_FILE
    value: "/etc/secrets/openrouter_key"

For production SaaS instance provisioning, the platform backend creates mindroom-api-keys-{instance_id} directly with Kubernetes before running Helm. The instance chart is then rendered with instanceSecrets.create=false, instanceSecrets.name, and a non-secret instanceSecrets.hash, so tenant API keys and OIDC client secrets do not enter Helm release values or rendered Helm Secret manifests.

Ingress

Each instance gets three hosts:

{customer}.{baseDomain} - MindRoom dashboard and API
{customer}.api.{baseDomain} - Direct API access
{customer}.matrix.{baseDomain} - Matrix/Synapse server

Platform Deployment

# Create values file from example
cp cluster/k8s/platform/values-staging.example.yaml cluster/k8s/platform/values-staging.yaml
# Edit with your configuration

helm upgrade --install platform ./cluster/k8s/platform \
  -f ./cluster/k8s/platform/values-staging.yaml \
  --namespace mindroom-staging

The namespace must match mindroom-{environment} where environment is set in values. For production, set platformSecrets.create=false and pre-create the named Secret so API keys, webhook secrets, and Matrix OIDC private keys do not enter Helm release values. The Secret must contain the same keys rendered by the chart-managed platform-secrets Secret, including supabase_service_key, stripe_secret_key, stripe_webhook_secret, provisioner_api_key, instance_credentials_encryption_secret, provider API keys, and the optional matrix_oidc_* keys.

Platform ingress hosts:

app.{domain} - Platform frontend
api.{domain} - Platform backend API
webhooks.{domain}/webhooks/stripe - Stripe webhooks

Local Development with Kind

just cluster-kind-fresh              # Start cluster with everything
just cluster-kind-port-frontend      # http://localhost:3000
just cluster-kind-port-backend       # http://localhost:8000
just cluster-kind-down               # Clean up

See cluster/k8s/kind/README.md for details.

CLI Helper

./cluster/scripts/mindroom-cli.sh list              # List instances
./cluster/scripts/mindroom-cli.sh status            # Overall status
./cluster/scripts/mindroom-cli.sh logs <id>         # View logs
./cluster/scripts/mindroom-cli.sh provision <id>    # Create instance
./cluster/scripts/mindroom-cli.sh deprovision <id>  # Remove instance
./cluster/scripts/mindroom-cli.sh upgrade <id>      # Upgrade instance

Reads configuration from saas-platform/.env.

Provisioner API

All endpoints require bearer token (PROVISIONER_API_KEY).

Endpoint	Method	Description
`/system/provision`	POST	Create or re-provision an instance
`/system/instances/{id}/start`	POST	Start a stopped instance
`/system/instances/{id}/stop`	POST	Stop a running instance
`/system/instances/{id}/restart`	POST	Restart an instance
`/system/instances/{id}/uninstall`	DELETE	Remove an instance
`/system/sync-instances`	POST	Sync states between DB and K8s

Example provision request:

curl -X POST "https://api.mindroom.chat/system/provision" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PROVISIONER_API_KEY" \
  -d '{"account_id": "uuid", "subscription_id": "sub-123", "tier": "starter"}'

The provisioner creates the namespace, generates URLs, deploys via Helm, and updates status in Supabase. For provisioned instance charts, set provisioner.instanceCredentialsEncryptionSecret to a stable high-entropy value so the provisioner can derive stable per-instance credentials_encryption_key chart values. If provisioner.instanceCredentialsEncryptionSecret is unset, the provisioner falls back to PROVISIONER_API_KEY when generating keys for new instances or explicit existing-instance opt-ins. Keep this source stable because changing it changes future derived credential encryption keys. New provisioned instances receive a derived credential encryption key by default. When re-provisioning an existing instance, the provisioner preserves the current encryption state by reusing credentials_encryption_key from the existing instance Secret when present. To enable credential encryption for an existing keyless instance, include "enable_credentials_encryption": true in the POST /system/provision request body. Treat that opt-in as a one-way switch until a plaintext migration exists. If an existing instance still has plaintext credential files, enabling credential encryption makes those files unreadable and encrypted-mode saves refuse to overwrite them. Clear or replace stale plaintext credential files before enabling the flag. If an already-encrypted instance has lost its instance Secret, pass "enable_credentials_encryption": true during reprovisioning so Helm receives the stable derived key again.

Deployment Scripts

cd saas-platform
./deploy.sh platform-frontend          # Deploy platform frontend
./deploy.sh platform-backend           # Deploy platform backend
./redeploy-mindroom.sh         # Redeploy all customer MindRoom instances

Multi-Tenant Architecture

Each customer instance gets:

Separate Kubernetes deployment in mindroom-instances namespace
Isolated PersistentVolumeClaim for data
Own Matrix/Synapse server (SQLite)
Independent ConfigMap configuration
Dedicated ingress routes

Platform services run in mindroom-{environment} namespace. The hosted SaaS chart currently runs Synapse per tenant, with server names such as {customer}.mindroom.chat. The public mindroom.chat Matrix server is the separate Tuwunel server on hetzner-matrix. Do not reuse mindroom.chat as the default SaaS tenant homeserver until the product has an explicit shared-homeserver mode with tenant isolation, provisioning, room ownership, SSO, and deprovisioning semantics. The runtime-only chart is the better starting point for a future bring-your-own-homeserver or shared-homeserver mode.