Deploy
Take OrthID to production on Kubernetes: the data-plane and control-plane split, a complete Helm values file, sizing guidance, TLS, and the checks that confirm a healthy rollout.
This page covers a single-region production deployment. If you have not run OrthID at all yet, start with Self-host for prerequisites and a local stack. To run more than one region, deploy this topology once per region and read Regions.
Topology: data plane and control plane
OrthID separates two responsibilities so they can scale and fail independently.
- Data plane: the request-path services that verify sessions, issue tokens, mint agent credentials, and serve sign-in. This is the hot path. It is stateless, horizontally scaled, and the only tier that must be reachable from your apps. It holds an open connection to Postgres and your key provider but keeps no local state.
- Control plane: the management services behind the Operator and Tenant consoles, plus background workers (audit archiving, key rewrap, SCIM sync, scheduled exports). It tolerates brief downtime without affecting live sign-ins. Run fewer replicas; do not expose it to the public internet.
Both planes share one Postgres database and one key provider. Keeping them on separate deployments means a surge of console or batch activity cannot starve the sign-in path, and you can roll or scale each independently.
A deployment is confined to its region; nothing replicates out of it. See Regions for multi-region patterns.
Helm values
The chart from Self-host exposes both planes. Below is a production-shaped values file. Sensitive connection strings come from an existing Kubernetes secret, not from this file.
region: au-syd-1
publicUrl: https://id.acme.health
# Connection strings and other secrets live in a Secret, not here.
existingSecret: orthid
secrets:
  provider: vault # customer-managed keys; see the BYOK guide.
# Data plane: the hot sign-in / verify / token path.
dataPlane:
  replicaCount: 4
  autoscaling:
    enabled: true
    minReplicas: 4
    maxReplicas: 20
    targetCPUUtilizationPercentage: 65
  resources:
    requests: { cpu: "500m", memory: "512Mi" }
    limits: { cpu: "2", memory: "1Gi" }
# Control plane: consoles and background workers.
controlPlane:
  replicaCount: 2
  resources:
    requests: { cpu: "250m", memory: "512Mi" }
    limits: { cpu: "1", memory: "1Gi" }
storage:
  endpoint: https://s3.ap-southeast-2.amazonaws.com
  bucket: orthid-prod-au
ingress:
  enabled: true
  className: nginx
  host: id.acme.health
  tls:
    enabled: true
    secretName: orthid-tls # cert-manager populates this

Apply it with a versioned upgrade so migrations run as a pre-upgrade hook:
helm upgrade --install orthid orthid/orthid \
  --namespace orthid --create-namespace \
  --values values.yaml \
  --version 1.8.0
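Before running the upgrade, it can help to check the values for settings that contradict each other. The sketch below is not part of the chart: `check_values` is a hypothetical helper that takes the values file already parsed into a dict (for example by a YAML library) and assumes the key layout shown above, with `tls` nested under `ingress`.

```python
def check_values(values: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []

    # Autoscaling bounds must be ordered, and the static count should sit inside them.
    data_plane = values.get("dataPlane", {})
    scaling = data_plane.get("autoscaling", {})
    if scaling.get("enabled"):
        lo = scaling.get("minReplicas", 0)
        hi = scaling.get("maxReplicas", 0)
        replicas = data_plane.get("replicaCount", lo)
        if lo > hi:
            problems.append(f"dataPlane.autoscaling: minReplicas {lo} > maxReplicas {hi}")
        elif not lo <= replicas <= hi:
            problems.append(f"dataPlane.replicaCount {replicas} is outside {lo}..{hi}")

    # An enabled ingress needs TLS switched on and a secret to read the cert from.
    ingress = values.get("ingress", {})
    tls = ingress.get("tls", {})
    if ingress.get("enabled") and not (tls.get("enabled") and tls.get("secretName")):
        problems.append("ingress.tls: enable TLS and set secretName")

    # Connection strings come from an existing Secret, so its name must be set.
    if not values.get("existingSecret"):
        problems.append("existingSecret: not set")

    return problems
```

An empty result only means these particular checks passed; the chart's own schema validation, if it has one, remains the authority.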
Sizing
OrthID is light on the request path because sessions.verify() validates JWTs locally and adds no network hop. The figures below are starting points for each tier; measure and let autoscaling do the rest.
| Component | Scale | Replicas | Sizing notes |
|---|---|---|---|
| Data plane | Up to 100k MAU | 4 pods | 0.5 vCPU / 512Mi each. Postgres: 2 vCPU, 8Gi, 100Gi disk. |
| Data plane | Up to 1M MAU | 6 to 12 pods | Autoscale to 12. Postgres: 4 vCPU, 16Gi, with a read replica for the control plane. |
| Control plane | Any size | 2 pods | Consoles and workers. Scale by background job volume (SCIM, exports), not by sign-in traffic. |
| Postgres | Shared | - | The capacity constraint at scale. Size connections and IOPS for peak sign-in bursts; enable PITR. |
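Because Postgres is the shared constraint, a rough way to size connections is to multiply the maximum pod count across both planes by the connections each pod may open. The sketch below does that arithmetic. The per-pod pool size, the `max_connections` figures, and the 20% headroom are assumptions for illustration, not OrthID defaults.

```python
def peak_connections(data_pods: int, control_pods: int, pool_per_pod: int) -> int:
    """Worst-case open connections if every pod fills its pool."""
    return (data_pods + control_pods) * pool_per_pod


def fits(peak: int, max_connections: int, headroom: float = 0.2) -> bool:
    """True if peak stays under max_connections with a fraction held in reserve."""
    return peak <= max_connections * (1 - headroom)


# maxReplicas of 20 on the data plane, 2 control-plane pods,
# and an assumed pool of 10 connections per pod.
peak = peak_connections(data_pods=20, control_pods=2, pool_per_pod=10)
print(peak)                              # 220
print(fits(peak, max_connections=200))   # False
print(fits(peak, max_connections=300))   # True
```

If the budget does not fit, the options are to shrink the per-pod pool, lower the autoscaling ceiling, or raise the database limit.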
TLS
All traffic to OrthID must be HTTPS; the API rejects plaintext, and session cookies are set Secure. Terminate TLS at your ingress. The values file above expects a certificate in the orthid-tls secret, which cert-manager can issue and renew automatically.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: orthid-tls
  namespace: orthid
spec:
  secretName: orthid-tls
  dnsNames:
    - id.acme.health
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer

Set publicUrl to the exact HTTPS origin OrthID is reached at. It is baked into token issuer and audience claims and into redirect URLs, so a mismatch breaks sign-in and SSO callbacks.
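Since a mismatched origin breaks sign-in, one way to catch it early is to compare publicUrl against the ingress host before deploying. The sketch below uses only the Python standard library. `origin_problems` is a hypothetical helper, and treating a path or trailing slash as a mismatch is an assumption about what "exact origin" means here.

```python
from urllib.parse import urlsplit


def origin_problems(public_url: str, ingress_host: str) -> list[str]:
    """Return reasons public_url is not the bare HTTPS origin for ingress_host."""
    parts = urlsplit(public_url)
    problems = []

    if parts.scheme != "https":
        problems.append("publicUrl must use https")

    # urlsplit lowercases the hostname, so compare case-insensitively.
    if (parts.hostname or "") != ingress_host.lower():
        problems.append(f"publicUrl host {parts.hostname!r} != ingress host {ingress_host!r}")

    # Assumption: an origin carries no path, query, or fragment.
    if parts.path or parts.query or parts.fragment:
        problems.append("publicUrl should be a bare origin with no path or trailing slash")

    return problems


print(origin_problems("https://id.acme.health", "id.acme.health"))  # []
```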
Post-deploy verification
After the rollout, confirm both planes are healthy before sending real traffic. Check that pods are ready, that the data plane reports all dependencies connected, and that a TLS sign-in round trip works.
# 1. Both planes rolled out and ready.
kubectl -n orthid get pods
kubectl -n orthid rollout status deploy/orthid-data-plane
kubectl -n orthid rollout status deploy/orthid-control-plane
# 2. Migrations applied at the expected schema version.
kubectl -n orthid logs job/orthid-migrate | tail -n 5
# 3. Readiness: database, storage, and secrets all connected.
curl -s https://id.acme.health/readyz | jq
# { "status": "ready", "region": "au-syd-1",
# "checks": { "database": "ok", "storage": "ok", "secrets": "ok" } }
# 4. TLS terminates and the right origin is served.
curl -sI https://id.acme.health/healthz | head -n 1
# HTTP/2 200

Gate routing on /readyz, not /healthz. A pod can be alive but unable to reach Postgres or the key provider; routing to it would fail sign-ins. /readyz only returns ready when every dependency is connected.
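The readiness check in step 3 can be scripted as a rollout gate. The sketch below assumes the /readyz response is a JSON object with the shape shown in the sample output (`status` plus a `checks` map). `failing_checks` and `wait_until_ready` are hypothetical helper names, and the timeout and polling interval are arbitrary.

```python
import json
import time
import urllib.request


def failing_checks(body: str) -> list[str]:
    """Names of dependencies not reporting "ok".

    Returns ["status"] when no single check fails but the overall
    status is still something other than "ready".
    """
    doc = json.loads(body)
    failing = [name for name, state in doc.get("checks", {}).items() if state != "ok"]
    if doc.get("status") != "ready" and not failing:
        failing.append("status")
    return failing


def wait_until_ready(url: str, timeout_s: float = 120, interval_s: float = 5) -> bool:
    """Poll url until it reports ready or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if not failing_checks(resp.read().decode()):
                    return True
        except (OSError, ValueError):
            # Unreachable, non-2xx, or not valid JSON yet: retry.
            pass
        time.sleep(interval_s)
    return False


# Example: wait_until_ready("https://id.acme.health/readyz")
```

Keeping the parsing in `failing_checks` separate from the polling loop means the decision logic can be tested without a live endpoint.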