Deployment#

Documentation verified. Last checked: 2026-02-23. Reviewer: Christof Buchbender.

Overview#

Deployments are driven by a causal build-state orchestrator in ci/check_builds.py, triggered via GitHub Actions (orchestrate-deploy.yml) and executed by Jenkins.

Repos are divided into deploy groups. Each group has its own gate check and its own Jenkins job, so a docs rebuild never blocks, and is never blocked by, a main-stack rebuild.

Deploy groups#

| Group   | Repos                                        | Jenkins jobs                                                                          |
|---------|----------------------------------------------|---------------------------------------------------------------------------------------|
| default | ops-db, ops-db-api, data-transfer, ops-db-ui | deploy-staging / deploy-production                                                     |
| docs    | data-center-documentation                    | deploy-data-center-documentation-develop / deploy-data-center-documentation-production |

Main-stack deploy (default group)#

  1. A developer pushes to one of the main-stack repos.

  2. GitHub Actions builds and pushes the Docker image, then calls notify-build-complete.yml which sends a repository_dispatch (type build-complete) to system-integration.

  3. orchestrate-deploy.yml runs (serialised per branch via a concurrency group):

    1. Update state — records sha, image_digest, and built_with in BUILD_STATE_JSON_<BRANCH>.

    2. Cascade — dispatches workflow_dispatch to immediate downstream repos so they rebuild against the new upstream image.

    3. Cross-group dispatch — dispatches update-submodules.yml in data-center-documentation for each cross_group_trigger entry in dependency-graph.yml.

    4. Gate check — resolves the triggering repo’s deploy group, then verifies every repo in that group is causally consistent (built_with matches current upstream digest). If all green, triggers the Jenkins job via its REST API.

  4. Jenkins pipeline (deploy_staging / deploy_production):

    • SSHs into input-b.{staging.}data.ccat.uni-koeln.de

    • Runs Alembic migrations (ops-db pipeline only)

    • docker compose pull + up -d (main stack compose file)

    • Smoke test
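The gate check in step 3 can be sketched in Python as follows. This is a minimal illustration under assumed names (`gate_check`, `state`, `deps` are inventions for this sketch); the real logic lives in ci/check_builds.py, which reads BUILD_STATE_JSON_<BRANCH> and dependency-graph.yml and may differ in detail.

```python
# Illustrative sketch of the causal gate check for the default group
# (image-digest comparison). Not the actual ci/check_builds.py code.

def gate_check(group_repos, state, deps):
    """Return True if every repo in the deploy group is causally consistent.

    state: {repo: {"sha": ..., "image_digest": ..., "built_with": {...}}}
    deps:  {repo: [upstream repos it is built against]}
    """
    for repo in group_repos:
        entry = state.get(repo)
        if entry is None:
            return False  # repo has never reported a build
        for dep in deps.get(repo, []):
            built_against = entry["built_with"].get(dep)
            current = state.get(dep, {}).get("image_digest")
            if built_against != current:
                return False  # built against a stale upstream image
    return True
```

Only when this returns True for every repo in the triggering repo's group does the orchestrator call the Jenkins REST API.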

Docs deploy (docs group)#

The docs stack is independent — it has its own compose files (docker-compose.docs.input-b.yml / docker-compose.docs.staging.input-b.yml) and its own Jenkins jobs.

The trigger chain:

  1. Any of the four tracked upstream repos publishes a new image.

  2. The orchestrator fires a cross-group workflow_dispatch to data-center-documentation, targeting update-submodules.yml.

  3. update-submodules.yml bumps all tracked submodule pointers to their latest HEAD on the dispatched branch, then commits and pushes (if anything changed).

  4. The push triggers docker-docs.yml, which builds and pushes the docs image and calls notify-build-complete with the submodule SHAs as built_with_json.

  5. The orchestrator receives the build-complete event, updates state, and runs the gate check for the docs group. The gate uses source_commit comparison: built_with[dep] is compared against state[dep].sha (git commit SHA of each submodule pointer) rather than an image digest.

  6. If all four submodule SHAs match, the Jenkins docs deploy job is triggered.
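Assuming the four tracked upstream repos are the main-stack repos listed above (an assumption for illustration), a docs-group state entry would carry git SHAs rather than image digests in built_with, along these lines:

{
  "sha":          "<data-center-documentation commit SHA>",
  "image_digest": "sha256:...",
  "built_with": {
    "ops-db":        "<submodule pointer SHA>",
    "ops-db-api":    "<submodule pointer SHA>",
    "data-transfer": "<submodule pointer SHA>",
    "ops-db-ui":     "<submodule pointer SHA>"
  }
}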

Config-only updates#

When only config changes are pushed (e.g., Docker Compose files, environment variables, monitoring configs) without rebuilding container images, a lightweight update pipeline automatically syncs those changes to all data-center machines without waiting for a full deploy cycle.

Trigger:

  1. Developer pushes to develop or main

  2. The update-system-integration GitHub Actions workflow (workflows/update-system-integration.yml) fires

  3. Workflow runs on ccat-internal runner (VPN access to Jenkins)

  4. Triggers Jenkins job:

    • update-system-integration-staging (for develop branch)

    • update-system-integration-production (for main branch)

Pipeline:

  1. SSH into all machines:

    • Staging: input-a, input-b, input-c on staging environment

    • Production: input-a, input-b, input-c on production environment

  2. On each machine, run:

    cd /opt/data-center/system-integration
    CI=true ccat update -y --no-image-pull
    

    This:

    • Refreshes GitHub authentication tokens (avoiding failures on short-lived HTTPS tokens)

    • Pulls latest code from the repository

    • Runs docker compose up -d to apply config changes immediately

    • Does not pull new container images (unlike full ccat update)

Benefits:

  • Config changes are applied immediately without waiting for Docker builds

  • Avoids unnecessary rebuilds when only compose files or configs change

  • Maintains causal consistency: triggered directly by git push, not via orchestration state machine

  • Lightweight and fast: minimal overhead vs. full deploy cycle

Manual trigger (workflow_dispatch):

You can also manually trigger the workflow from the GitHub UI, selecting either staging or production environment.

To verify the update was applied:

ssh <user>@input-{a,b,c}.{staging.}data.ccat.uni-koeln.de
cd /opt/data-center/system-integration
git log -1
docker compose ps  # or: ccat ps

Causal consistency model#

Build state is stored in two GitHub repository variables: BUILD_STATE_JSON_DEVELOP and BUILD_STATE_JSON_MAIN. Each per-repo entry has the form:

{
  "sha":          "<git commit SHA>",
  "image_digest": "sha256:...",
  "ts":           "2026-01-01T00:00:00+00:00",
  "built_with":   { "ops-db": "sha256:..." }
}

The built_with_type field in ci/dependency-graph.yml controls what the gate check compares:

| Type          | Gate compares built_with[dep] against                              |
|---------------|--------------------------------------------------------------------|
| runtime       | state[dep].image_digest                                            |
| base_image    | state[dep].image_digest                                            |
| source_commit | state[dep].sha (used by docs: submodule pointer vs upstream git SHA) |
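In code terms, the gate's comparison can be sketched as a field lookup keyed by dependency type. This is an illustration only; the names below are assumptions, not the actual ci/check_builds.py API.

```python
# Sketch: built_with_type selects which state field the gate compares.
COMPARE_FIELD = {
    "runtime": "image_digest",
    "base_image": "image_digest",
    "source_commit": "sha",  # docs group: git SHA of the submodule pointer
}

def is_consistent(dep_type, built_with_value, dep_state):
    """Compare built_with[dep] against the field selected by the dep type."""
    return built_with_value == dep_state[COMPARE_FIELD[dep_type]]
```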

Redis TLS Certificate Management#

Redis connections between machines use mutual TLS. Four cert variants exist, one per Redis instance:

| Variant      | Redis server           | Deployed to                                       |
|--------------|------------------------|---------------------------------------------------|
| main         | input-b (production)   | input-a, input-b, input-c, reuna                  |
| ccat         | reuna (Chile)          | input-a, input-b, input-c, reuna                  |
| develop      | input-b.staging        | input-a.staging, input-b.staging, input-c.staging |
| develop-ccat | (future staging Chile) | input-a.staging, input-b.staging, input-c.staging |

All production machines receive both main and ccat certs so that services can connect to either Redis instance without future cert changes.

File permissions on remote hosts:

| File       | Mode       | Owner     | Notes                                            |
|------------|------------|-----------|--------------------------------------------------|
| ca.crt     | 644        | root:root | Public; establishes trust in the server cert     |
| client.crt | 644        | root:root | Presented by all clients during handshake        |
| client.key | 644        | root:root | World-readable; required by multi-UID containers |
| redis.crt  | 644        | root:root | Server cert; only on server machines             |
| redis.key  | 600        | 999:root  | UID 999 = Redis container user; only on server machines |
| ca.key     | local only |           | Never deployed; only needed for cert signing     |

Certs are deployed to /opt/redis-certs/{variant}/ on each machine.
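For reference, a Redis server consuming these files would typically be configured with mutual-TLS directives along these lines. This is a sketch using standard Redis 6+ options and the paths above; the actual configuration in this stack may differ.

# redis.conf fragment: mutual TLS (illustrative, main variant)
tls-port 6379
port 0                                             # disable the plaintext port
tls-cert-file    /opt/redis-certs/main/redis.crt   # server cert (server machines only)
tls-key-file     /opt/redis-certs/main/redis.key   # mode 600, readable by UID 999
tls-ca-cert-file /opt/redis-certs/main/ca.crt      # trust anchor for client certs
tls-auth-clients yes                               # require client.crt on every connection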

Workflow#

The ccat redis-certs sub-command handles the full lifecycle:

# 1. Generate certs locally (runs 8 openssl steps, sets permissions)
ccat redis-certs generate main

# 2. Check sync status before distributing
ccat redis-certs status --variant main

# 3. Dry-run first
ccat redis-certs distribute --variant main --dry-run

# 4. Deploy to all target hosts via Ansible
ccat redis-certs distribute --variant main

# 5. Verify all machines are in sync
ccat redis-certs status --variant main

# Rotate (regenerate + distribute in one step)
ccat redis-certs rotate --variant main

The status command shows a Rich table comparing the local ca.crt SHA256 fingerprint against the fingerprint read from each remote host via an Ansible ad-hoc command. Use --verbose to see the full fingerprint instead of just the last 16 characters.
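The comparison behind status can be sketched like this. It is a minimal illustration with invented names (`cert_fingerprint`, `in_sync`); whether the real CLI hashes the raw file bytes or something else is an assumption here.

```python
import hashlib

def cert_fingerprint(pem_bytes: bytes) -> str:
    """SHA256 hex fingerprint of a certificate file's raw bytes."""
    return hashlib.sha256(pem_bytes).hexdigest()

def in_sync(local_pem: bytes, remote_pem: bytes, verbose: bool = False):
    """Compare local and remote ca.crt; show last 16 hex chars unless verbose."""
    local_fp = cert_fingerprint(local_pem)
    remote_fp = cert_fingerprint(remote_pem)
    shown = remote_fp if verbose else remote_fp[-16:]
    return local_fp == remote_fp, shown
```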

After distributing new certs, restart Redis on each affected machine:

ssh <user>@input-b.data.ccat.uni-koeln.de
cd /opt/data-center/system-integration
docker compose restart redis

Ansible role#

The redis_certs role (ansible/roles/redis_certs/) is included in the ccat, input_staging, and input_ccat plays in playbook_setup_vms.yml and can be run in isolation via:

ccat provision --group input_ccat --tag redis_certs

Host-specific behaviour (client vs. server) is controlled by two variables:

  • redis_cert_variants — which variants to deploy on this host (set in group_vars/<group>/vars_redis_certs.yml)

  • redis_cert_server_variants — variants where this host is the Redis server (set in host_vars/<host>/redis_certs.yml for input-b and input-b.staging; overrides the group-level empty default)
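As an illustration only (hypothetical file contents, not the actual repo), the server-side override for input-b consistent with the variant table above might look like:

# host_vars/input-b/redis_certs.yml (hypothetical example)
redis_cert_server_variants:
  - main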

Local Development#

git clone <repository_url>
cd system-integration
git submodule update --init --recursive
cp .env.example .env
# Edit .env — set POSTGRES_PASSWORD, REDIS_PASSWORD, etc.
ccat update   # or: make start_main

Staging Environment#

Staging runs on input-b.staging.data.ccat.uni-koeln.de.

Jenkins deploy-staging / deploy-data-center-documentation-develop handle staging deploys automatically when the gate check passes on the develop branch.

To deploy manually:

ssh <user>@input-b.staging.data.ccat.uni-koeln.de
cd /opt/data-center/system-integration
docker compose -f docker-compose.staging.input-b.yml pull
docker compose -f docker-compose.staging.input-b.yml up -d

Production Deployment#

Production runs on input-b.data.ccat.uni-koeln.de.

Jenkins deploy-production / deploy-data-center-documentation-production handle production deploys automatically when the gate check passes on the main branch.

To deploy manually:

ssh <user>@input-b.data.ccat.uni-koeln.de
cd /opt/data-center/system-integration
# Main stack:
docker compose -f docker-compose.production.input-b.yml pull
docker compose -f docker-compose.production.input-b.yml up -d
# Docs (independent):
docker compose -f docker-compose.docs.input-b.yml pull
docker compose -f docker-compose.docs.input-b.yml up -d