Data Center Operations

Note

Audience: Infrastructure and DevOps teams

This section covers deployment, infrastructure management, monitoring, and operational procedures for the CCAT Data Center.

Overview

The CCAT Data Center operations team is responsible for:

  • Infrastructure Management - Maintaining servers, storage, and network infrastructure

  • Deployment & Configuration - Managing Docker, Kubernetes, and service deployments

  • Monitoring & Observability - Ensuring system health with Grafana and Loki

  • Security & Secrets - Managing credentials and access control via Infisical

  • Backup & Recovery - Implementing data protection and disaster recovery procedures

  • Team Coordination - Following collaborative workflows and documentation practices

System Environments

The data center supports multiple deployment environments:

  • Production - Live system serving telescope operations

  • Staging - Testing environment with production-like configuration

  • Local Development - Docker Compose setup for development and testing

Key Technologies

  • Containerization - Docker and Docker Compose

  • Orchestration - Kubernetes

  • Monitoring - Grafana, Loki, Promtail

  • Databases - PostgreSQL, InfluxDB, Redis

  • Object Storage - MinIO, Coscine (S3-compatible)

  • Secrets Management - Infisical

  • CI/CD - GitHub Actions, Jenkins

  • Configuration Management - Ansible

For component-specific operational details, see the Component Developer Documentation section.