Overview

Documentation verified. Last checked: 2025-11-25. Reviewer: Christof Buchbender

What is ops-db?

ops-db is the core PostgreSQL database schema and SQLAlchemy ORM layer that serves as the operational brain of the CCAT Data Center. It tracks everything from observation planning → execution → data movement → archival → publication.

The schema is defined with the SQLAlchemy ORM, which gives Python code a single interface to all operational data. ops-db serves as the backend for two key interfaces:

  • ops-db-api - RESTful API for programmatic access by instruments and automated systems

  • ops-db-ui - Web interface for browsing observations and monitoring system state

Design Philosophy

Single Source of Truth

All operational metadata lives in ops-db, avoiding duplication across systems. This ensures consistency and simplifies data governance.

Polymorphic Models

Many entity types use SQLAlchemy’s polymorphic inheritance to accommodate different subtypes while sharing a common base model.
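As an illustration of the pattern, here is a minimal sketch of single-table polymorphic inheritance in SQLAlchemy. The classes `ExampleLocation`, `ExampleDiskLocation`, and `ExampleTapeLocation`, their columns, and the in-memory SQLite engine are all hypothetical and chosen only for this example; they are not the actual ops-db models, which may use a different inheritance strategy.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class ExampleLocation(Base):
    """Hypothetical base class; not an actual ops-db model."""

    __tablename__ = "example_location"

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # Discriminator column: tells SQLAlchemy which subclass a row maps to.
    kind = Column(String, nullable=False)

    __mapper_args__ = {"polymorphic_on": kind, "polymorphic_identity": "generic"}


class ExampleDiskLocation(ExampleLocation):
    # Single-table inheritance: subclass columns live in the parent table.
    path = Column(String)

    __mapper_args__ = {"polymorphic_identity": "disk"}


class ExampleTapeLocation(ExampleLocation):
    tape_label = Column(String)

    __mapper_args__ = {"polymorphic_identity": "tape"}


# In-memory SQLite keeps the sketch self-contained (ops-db itself uses PostgreSQL).
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        ExampleDiskLocation(name="buffer", path="/data/buffer"),
        ExampleTapeLocation(name="archive", tape_label="T0001"),
    ])
    session.commit()

    # Querying the base class returns instances of the matching subclasses.
    locations = session.execute(select(ExampleLocation)).scalars().all()
    loaded_types = [type(loc).__name__ for loc in locations]
```

A single query against the base class is enough to load every subtype, which is the main convenience of this pattern when code needs to treat related entities uniformly.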

Physical Copy Tracking

The database tracks not just what data exists but also where each copy physically resides, using PhysicalCopy models. This enables safe deletion, staged unpacking, and complete audit trails.
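To make the "safe deletion" idea concrete, the following sketch checks whether a copy may be removed. The `PhysicalCopySketch` class, its field names, the location names, and the rule itself (delete only when another verified copy exists at a different location) are assumptions made for illustration, not the policy ops-db actually implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalCopySketch:
    """Hypothetical stand-in for a PhysicalCopy row (field names are assumed)."""

    package_id: int
    location: str
    verified: bool


def can_delete(copy, all_copies):
    """Assumed rule: delete only if another verified copy exists elsewhere."""
    return any(
        other.package_id == copy.package_id
        and other.location != copy.location
        and other.verified
        for other in all_copies
    )


copies = [
    PhysicalCopySketch(package_id=1, location="site-buffer", verified=True),
    PhysicalCopySketch(package_id=1, location="long-term-archive", verified=True),
    PhysicalCopySketch(package_id=2, location="site-buffer", verified=True),
]

# Package 1 has a second verified copy; package 2 exists only on the buffer.
deletable = [
    c for c in copies if c.location == "site-buffer" and can_delete(c, copies)
]
```

Because every copy is a record of its own, a check like this can be answered from metadata alone, without touching the storage systems.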

Status-Driven Workflows

Entities use status enums (Status, PackageState, PhysicalCopyStatus) to record their processing state, which enables automated workflows and retry logic.
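The sketch below shows one way an enum-based status field can drive a workflow with retries. The `ExampleStatus` members, the transition table, and the `run_with_retry` helper are hypothetical; the members of the real Status, PackageState, and PhysicalCopyStatus enums may differ.

```python
from enum import Enum


class ExampleStatus(Enum):
    """Hypothetical states; the real enum members may differ."""

    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions; FAILED -> PENDING is what makes a retry possible.
ALLOWED = {
    ExampleStatus.PENDING: {ExampleStatus.IN_PROGRESS},
    ExampleStatus.IN_PROGRESS: {ExampleStatus.COMPLETED, ExampleStatus.FAILED},
    ExampleStatus.FAILED: {ExampleStatus.PENDING},
    ExampleStatus.COMPLETED: set(),
}


def advance(current, target):
    """Return target if the transition is allowed, otherwise raise."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


def run_with_retry(task, max_attempts=3):
    """Run task() until it succeeds or attempts run out; return status and history."""
    status = ExampleStatus.PENDING
    history = [status]
    for attempt in range(1, max_attempts + 1):
        status = advance(status, ExampleStatus.IN_PROGRESS)
        history.append(status)
        try:
            task()
        except Exception:
            status = advance(status, ExampleStatus.FAILED)
            history.append(status)
            if attempt < max_attempts:
                # Requeue the work so the next loop iteration can pick it up.
                status = advance(status, ExampleStatus.PENDING)
                history.append(status)
        else:
            status = advance(status, ExampleStatus.COMPLETED)
            history.append(status)
            break
    return status, history


calls = {"n": 0}


def flaky_task():
    """Fails on the first call only, simulating a transient error."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated transient failure")


final_status, history = run_with_retry(flaky_task)
```

Keeping the allowed transitions in one table means that an automated worker cannot silently move an entity into a state the workflow does not expect.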

Relationship-Rich

Heavy use of foreign keys and SQLAlchemy relationships maintains referential integrity and enables efficient queries across related entities.
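A minimal sketch of how relationships support queries across related entities is shown here. `InstrumentSketch`, `ModuleSketch`, and `FileSketch` are simplified, hypothetical stand-ins loosely modelled on the instrument, module, and raw file entities; their columns and the in-memory SQLite engine are assumptions for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class InstrumentSketch(Base):
    """Hypothetical, simplified model; not the actual ops-db class."""

    __tablename__ = "instrument_sketch"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    modules = relationship("ModuleSketch", back_populates="instrument")


class ModuleSketch(Base):
    __tablename__ = "module_sketch"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # The foreign key expresses the link at the database level ...
    instrument_id = Column(Integer, ForeignKey("instrument_sketch.id"), nullable=False)
    # ... while the relationship provides navigation in Python.
    instrument = relationship("InstrumentSketch", back_populates="modules")
    files = relationship("FileSketch", back_populates="module")


class FileSketch(Base):
    __tablename__ = "file_sketch"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    module_id = Column(Integer, ForeignKey("module_sketch.id"), nullable=False)
    module = relationship("ModuleSketch", back_populates="files")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    instrument = InstrumentSketch(name="example-instrument")
    module = ModuleSketch(name="example-module", instrument=instrument)
    module.files.append(FileSketch(name="scan_0001.dat"))
    session.add(instrument)  # the module and file are added by cascade
    session.commit()

    # One joined query that follows two relationships.
    stmt = (
        select(FileSketch.name)
        .join(FileSketch.module)
        .join(ModuleSketch.instrument)
        .where(InstrumentSketch.name == "example-instrument")
    )
    file_names = session.execute(stmt).scalars().all()
```

The joins are expressed through the relationship attributes, so the query does not need to restate the foreign key conditions.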

Major Entity Categories

The database organizes data into several major categories:

Observatory Infrastructure

The physical telescope, instruments, and modules that produce data: Observatory, Telescope, Instrument, InstrumentModule. See Observatory Hierarchy for details.

Scientific Planning

Observing programs, sub-programs, and observation units that define what to observe: ObservingProgram, SubObservingProgram, ObsUnit, Source, ObsMode. See Observation Model for details.

Execution Tracking

Records of actual observations with timing, conditions, and status: ExecutedObsUnit. See Observation Model for details.

Data Management

Files, packages, and physical copies across multiple storage locations: RawDataFile, RawDataPackage, DataTransferPackage, PhysicalCopy. See Data Model for details.

Transfer Infrastructure

Sites, locations, routes that define how data moves through the system: Site, DataLocation, DataTransferRoute. See Location Model and Transfer Model for details.

Archival & Staging

Long-term archive transfers and staging jobs for processing: LongTermArchiveTransfer, StagingJob. See Transfer Model for details.

Access Control

Users, roles, and API tokens for authentication and authorization: User, Role, ApiToken.

How Data Flows Through ops-db

Conceptually, data flows through ops-db as follows:

  1. Planning - Observing programs and observation units are created before observations begin

  2. Execution - Telescope systems create ExecutedObsUnit records when observations run

  3. Data Registration - Raw data files are registered and linked to executed observations

  4. Packaging - Files are bundled into RawDataPackage for efficient archiving and transfer

  5. Transfer - Packages are transferred between sites via DataTransferPackage and DataTransfer records

  6. Archive - Packages are archived to long-term storage via LongTermArchiveTransfer

  7. Physical Copies - PhysicalCopy records track where each file/package exists at each stage
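The steps above can be sketched as a small in-memory walk-through. The `PackageSketch` class, the helper functions, the identifiers, and the location names are all hypothetical; the sketch only mirrors the order of the steps and does not reflect the actual ops-db or data-transfer code.

```python
from dataclasses import dataclass, field


@dataclass
class PackageSketch:
    """Hypothetical stand-in for a RawDataPackage."""

    name: str
    files: list
    copies: list = field(default_factory=list)  # locations holding this package


def register_files(executed_obs_unit_id, file_names):
    # Step 3: link each raw file to the executed observation that produced it.
    return [
        {"name": name, "executed_obs_unit_id": executed_obs_unit_id}
        for name in file_names
    ]


def build_package(name, files, location):
    # Step 4: bundle the files; step 7: record where the new package lives.
    package = PackageSketch(name=name, files=list(files))
    package.copies.append(location)
    return package


def transfer(package, destination):
    # Steps 5 and 6: a completed transfer or archive adds one more physical copy.
    package.copies.append(destination)
    return package


files = register_files(executed_obs_unit_id=42, file_names=["a.dat", "b.dat"])
package = build_package("pkg-0001", files, location="observatory-buffer")
transfer(package, "data-center-buffer")
transfer(package, "long-term-archive")
```

After the final step the package carries a list of every location that holds it, which is the kind of information the PhysicalCopy records provide at each stage.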

For detailed workflow documentation, see the Data Transfer System documentation.

What ops-db Does NOT Contain

ops-db is a metadata database - it tracks information about data, not the data itself:

  • Actual data files - Files are stored on disk/S3/tape; ops-db just tracks metadata and locations

  • Processing results - Processed data is likewise stored on disk/S3/tape; ops-db tracks only its metadata and locations

  • Real-time telescope telemetry - That is tracked in our housekeeping system (InfluxDB)

  • Long log files - Logs are stored on disk; ops-db has references to log file paths
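To illustrate the distinction between data and metadata, the sketch below builds a record that describes a file without storing its contents. The `describe_file` helper and the record fields (`path`, `size_bytes`, `sha256`) are assumptions for this example; the columns ops-db actually stores may differ.

```python
import hashlib
import os
import tempfile


def describe_file(path):
    """Build a metadata record for a file; the bytes themselves are not stored."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "path": path,
        "size_bytes": os.path.getsize(path),
        "sha256": digest.hexdigest(),
    }


payload = b"example detector samples"
with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "scan_0001.dat")
    with open(file_path, "wb") as handle:
        handle.write(payload)
    record = describe_file(file_path)
```

The record stays small regardless of file size, which is what keeps a metadata database manageable while the files themselves live on disk, S3, or tape.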

Integration Points

ops-db integrates with several other CCAT components:

  • ops-db-api - Provides RESTful endpoints for programmatic access to the database

  • ops-db-ui - Provides a web interface for browsing and managing database records

  • data-transfer - Reads/writes transfer and archive records, orchestrates actual file movements

  • system-integration - Handles deployment and infrastructure setup

For details on integration, see Related Components.

Entity Relationships

graph TB
    subgraph Infrastructure["Observatory Infrastructure"]
        OBS[Observatory]
        TEL[Telescope]
        INST[Instrument]
        MOD[InstrumentModule]
        OBS --> TEL
        TEL --> INST
        INST --> MOD
    end

    subgraph Planning["Scientific Planning"]
        PROG[ObservingProgram]
        SUB[SubObservingProgram]
        OU[ObsUnit]
        SRC[Source]
        PROG --> SUB
        PROG --> OU
        SUB --> OU
        SRC --> OU
    end

    subgraph Execution["Execution"]
        EOU[ExecutedObsUnit]
        OU --> EOU
    end

    subgraph Data["Data Management"]
        RDF[RawDataFile]
        RDP[RawDataPackage]
        DTP[DataTransferPackage]
        PC[PhysicalCopy]
        EOU --> RDF
        EOU --> RDP
        RDF --> RDP
        RDP --> DTP
        RDF --> PC
        RDP --> PC
        DTP --> PC
    end

    subgraph Locations["Storage Locations"]
        SITE[Site]
        DL[DataLocation]
        SITE --> DL
        PC --> DL
    end

    MOD --> RDF
    MOD --> RDP
    

Next Steps

Now that you understand the high-level architecture: