Overview#
What is ops-db?#
ops-db is the core PostgreSQL database schema and SQLAlchemy ORM layer that serves as the operational brain of the CCAT Data Center. It tracks the full data lifecycle: observation planning → execution → data movement → archival → publication.
Because the schema is defined with the SQLAlchemy ORM, all operational data is accessible through a Python interface. ops-db serves as the backend for two key interfaces:
ops-db-api - RESTful API for programmatic access by instruments and automated systems
ops-db-ui - Web interface for browsing observations and monitoring system state
Design Philosophy#
- Single Source of Truth
All operational metadata lives in ops-db, avoiding duplication across systems. This ensures consistency and simplifies data governance.
- Polymorphic Models
Many entity types use SQLAlchemy’s polymorphic inheritance to accommodate different subtypes:
Source has FixedSource, SolarSystemObject, and ConstantElevationSource subtypes
DataLocation has DiskDataLocation, S3DataLocation, and TapeDataLocation subtypes
PhysicalCopy has RawDataFilePhysicalCopy, RawDataPackagePhysicalCopy, and DataTransferPackagePhysicalCopy subtypes
- Physical Copy Tracking
The database tracks not just what data exists, but WHERE it physically exists through PhysicalCopy models. This enables safe deletion, staged unpacking, and complete audit trails.
- Status-Driven Workflows
Entities use Status enums (Status, PackageState, PhysicalCopyStatus) to track processing state, enabling automated workflows and retry logic.
- Relationship-Rich
Heavy use of SQLAlchemy relationships maintains referential integrity and enables efficient queries across related entities.
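To make the polymorphic-model idea concrete, here is a minimal sketch of SQLAlchemy single-table inheritance for Source and two of its subtypes. It is not the actual ops-db schema: the table name, the discriminator column `kind`, the per-subtype columns, and the sample rows are all assumptions chosen for illustration, and ops-db may well use joined-table inheritance instead.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Source(Base):
    """Base class; every subtype shares this table (assumed layout)."""
    __tablename__ = "source"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    kind = Column(String, nullable=False)  # hypothetical discriminator column
    __mapper_args__ = {"polymorphic_on": kind, "polymorphic_identity": "source"}


class FixedSource(Source):
    # Hypothetical columns for a source at fixed sky coordinates.
    ra_deg = Column(Float)
    dec_deg = Column(Float)
    __mapper_args__ = {"polymorphic_identity": "fixed_source"}


class SolarSystemObject(Source):
    # Hypothetical column for a moving target looked up by ephemeris name.
    ephemeris_name = Column(String)
    __mapper_args__ = {"polymorphic_identity": "solar_system_object"}


engine = create_engine("sqlite:///:memory:")  # in-memory stand-in for PostgreSQL
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        FixedSource(name="Orion KL", ra_deg=83.81, dec_deg=-5.37),
        SolarSystemObject(name="Jupiter", ephemeris_name="jupiter"),
    ])
    session.commit()

    # Querying the base class returns instances of the matching subclasses.
    loaded = session.query(Source).order_by(Source.name).all()
    loaded_types = [type(s).__name__ for s in loaded]
```

A query against the base class returns each row as an instance of its own subtype, so code that only needs the common fields can work with Source while subtype-specific code can check the concrete class.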
Major Entity Categories#
The database organizes data into several major categories:
- Observatory Infrastructure
The physical telescope, instruments, and modules that produce data: Observatory, Telescope, Instrument, InstrumentModule. See Observatory Hierarchy for details.
- Scientific Planning
Observing programs, sub-programs, and observation units that define what to observe: ObservingProgram, SubObservingProgram, ObsUnit, Source, ObsMode. See Observation Model for details.
- Execution Tracking
Records of actual observations with timing, conditions, and status: ExecutedObsUnit. See Observation Model for details.
- Data Management
Files, packages, and physical copies across multiple storage locations: RawDataFile, RawDataPackage, DataTransferPackage, PhysicalCopy. See Data Model for details.
- Transfer Infrastructure
Sites, locations, routes that define how data moves through the system: Site, DataLocation, DataTransferRoute. See Location Model and Transfer Model for details.
- Archival & Staging
Long-term archive transfers and staging jobs for processing: LongTermArchiveTransfer, StagingJob. See Transfer Model for details.
- Access Control
Users, roles, and API tokens for authentication and authorization: User, Role, ApiToken.
How Data Flows Through ops-db#
Conceptually, data flows through ops-db as follows:
Planning - Observing programs and observation units are added prior to observations
Execution - Telescope systems create ExecutedObsUnit records when observations run
Data Registration - Raw data files are registered and linked to executed observations
Packaging - Files are bundled into RawDataPackage for efficient archiving and transfer
Transfer - Packages are transferred between sites via DataTransferPackage and DataTransfer records
Archive - Packages are archived to long-term storage via LongTermArchiveTransfer
Physical Copies - PhysicalCopy records track where each file/package exists at each stage
For detailed workflow documentation, see the Data Transfer System documentation.
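Because PhysicalCopy records say where each package exists at each stage, a cleanup job can check for a second copy before removing anything. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration rather than the data-transfer implementation: the enum members, the dataclass fields, the `can_delete` helper, and the location names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PhysicalCopyStatus(Enum):
    # Hypothetical members; the real enum values in ops-db may differ.
    PRESENT = "present"
    VERIFIED = "verified"
    DELETED = "deleted"


@dataclass
class PhysicalCopy:
    package: str    # identifier of the package this copy belongs to
    location: str   # name of the storage location holding the copy
    status: PhysicalCopyStatus


def can_delete(copy, all_copies):
    """Allow deletion only if a verified copy of the same package
    exists at a different location."""
    return any(
        other.package == copy.package
        and other.location != copy.location
        and other.status is PhysicalCopyStatus.VERIFIED
        for other in all_copies
    )


copies = [
    PhysicalCopy("pkg-001", "telescope-disk", PhysicalCopyStatus.VERIFIED),
    PhysicalCopy("pkg-001", "archive-tape", PhysicalCopyStatus.VERIFIED),
    PhysicalCopy("pkg-002", "telescope-disk", PhysicalCopyStatus.VERIFIED),
]

# pkg-001 has a verified copy on tape, so the disk copy may go;
# pkg-002 exists only on disk, so it must stay.
deletable = [c.package for c in copies
             if c.location == "telescope-disk" and can_delete(c, copies)]
```

In this sketch the decision depends only on recorded metadata, which is the point of tracking physical copies in the database rather than inspecting storage directly.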
What ops-db Does NOT Contain#
ops-db is a metadata database - it tracks information about data, not the data itself:
Actual data files - Files are stored on disk/S3/tape; ops-db just tracks metadata and locations
Processing results - Processed data is likewise stored on disk/S3/tape; ops-db just tracks metadata and locations
Real-time telescope telemetry - That is tracked in our housekeeping system (InfluxDB)
Long log files - Logs are stored on disk; ops-db has references to log file paths
Integration Points#
ops-db integrates with several other CCAT components:
ops-db-api - Provides RESTful endpoints for programmatic access to the database
ops-db-ui - Provides a web interface for browsing and managing database records
data-transfer - Reads/writes transfer and archive records, orchestrates actual file movements
system-integration - Handles deployment and infrastructure setup
For details on integration, see Related Components.
Entity Relationships#
graph TB
subgraph Infrastructure["Observatory Infrastructure"]
OBS[Observatory]
TEL[Telescope]
INST[Instrument]
MOD[InstrumentModule]
OBS --> TEL
TEL --> INST
INST --> MOD
end
subgraph Planning["Scientific Planning"]
PROG[ObservingProgram]
SUB[SubObservingProgram]
OU[ObsUnit]
SRC[Source]
PROG --> SUB
PROG --> OU
SUB --> OU
SRC --> OU
end
subgraph Execution["Execution"]
EOU[ExecutedObsUnit]
OU --> EOU
end
subgraph Data["Data Management"]
RDF[RawDataFile]
RDP[RawDataPackage]
DTP[DataTransferPackage]
PC[PhysicalCopy]
EOU --> RDF
EOU --> RDP
RDF --> RDP
RDP --> DTP
RDF --> PC
RDP --> PC
DTP --> PC
end
subgraph Locations["Storage Locations"]
SITE[Site]
DL[DataLocation]
SITE --> DL
PC --> DL
end
MOD --> RDF
MOD --> RDP
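The Observatory → Telescope → Instrument edges of the diagram could be expressed as SQLAlchemy relationships roughly as follows. This is a sketch under assumptions, not the ops-db models: the table names, column names, foreign keys, and placeholder record names are hypothetical, and InstrumentModule is omitted for brevity.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Observatory(Base):
    __tablename__ = "observatory"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    telescopes = relationship("Telescope", back_populates="observatory")


class Telescope(Base):
    __tablename__ = "telescope"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    observatory_id = Column(Integer, ForeignKey("observatory.id"), nullable=False)
    observatory = relationship("Observatory", back_populates="telescopes")
    instruments = relationship("Instrument", back_populates="telescope")


class Instrument(Base):
    __tablename__ = "instrument"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    telescope_id = Column(Integer, ForeignKey("telescope.id"), nullable=False)
    telescope = relationship("Telescope", back_populates="instruments")


engine = create_engine("sqlite:///:memory:")  # in-memory stand-in for PostgreSQL
Base.metadata.create_all(engine)

with Session(engine) as session:
    obs = Observatory(name="example-observatory")
    tel = Telescope(name="example-telescope", observatory=obs)
    Instrument(name="example-instrument", telescope=tel)
    session.add(obs)  # the default save-update cascade adds the related rows
    session.commit()

    # Join across the hierarchy: instruments belonging to one observatory.
    instrument_names = [
        name
        for (name,) in session.query(Instrument.name)
        .join(Instrument.telescope)
        .join(Telescope.observatory)
        .filter(Observatory.name == "example-observatory")
    ]
```

The foreign keys enforce referential integrity at the database level, while the relationship attributes let Python code walk the hierarchy in either direction.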
Next Steps#
Now that you understand the high-level architecture:
Learn the observatory hierarchy: See Observatory Hierarchy
Understand observation planning: See Observation Model
Explore data tracking: See Data Model
Learn about storage locations: See Location Model
Understand data transfer: See Transfer Model
Browse the complete API: See Models API Reference