# Data Transfer System
```{eval-rst}
.. verified:: 2025-10-16
   :reviewer: Christof Buchbender
```
The CCAT Data Transfer System orchestrates the complete lifecycle of observatory data,
from telescope acquisition through long-term archive (LTA) storage to the cleanup of
upstream data once archival has succeeded. The system is designed to handle peak data
rates of 8 TB/day across geographically distributed sites while ensuring data integrity,
availability, and efficient resource utilization.
```{toctree}
:caption: 'Contents:'
:hidden: true
:maxdepth: 2
source/philosophy
source/transfer_route_CCAT_Cologne
source/concepts
source/pipeline
source/routing
source/monitoring
source/lifecycle
source/api/index
```
## Overview
**What it does:**
The Data Transfer System manages the automated flow of raw astronomical data through
multiple processing stages:
- Package raw data files from instrument computers
- Transfer data packages between geographically distributed sites
- Verify data integrity at each stage
- Archive data to long-term storage (LTA)
- Stage data for scientific processing
- Clean up original and temporary storage based on retention policies
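
As an illustration of the integrity-verification step listed above, the following is a minimal sketch of comparing a file's checksum before and after a transfer. It is not the system's actual implementation: the function names `file_checksum` and `verify_transfer` are hypothetical, and SHA-256 is an assumed algorithm chosen only for the example.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks.

    Hypothetical helper; the hash algorithm is an assumption for illustration.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data packages never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source: Path, destination: Path) -> bool:
    """Return True if both copies have identical checksums (hypothetical helper)."""
    return file_checksum(source) == file_checksum(destination)
```

In practice the source checksum would typically be computed once and stored alongside the package, so that each later stage only needs to hash its own copy.
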
**Why it matters:**
With peak data rates of 8 TB/day and operations spanning multiple continents, manual data
management would be error-prone and inefficient, and could not meet scientific
requirements. The automated system ensures:
- **Reliability**: Data safely reaches long-term archives (LTA) without human intervention
- **Efficiency**: Intelligent routing and parallel processing maximize throughput
- **Integrity**: Multi-layer checksums catch corruption early
- **Resilience**: Automatic retry and recovery handle transient failures
- **Visibility**: Comprehensive monitoring provides operational insight
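
To put the 8 TB/day peak rate in perspective, the short calculation below converts it into the sustained throughput a link would need if the data were spread evenly over 24 hours. This is a back-of-the-envelope sketch: it assumes decimal terabytes (1 TB = 10^12 bytes) and ignores protocol overhead, retries, and bursts. The function name `sustained_rate` is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate(bytes_per_day: float) -> tuple[float, float]:
    """Return (megabytes per second, megabits per second) for a daily volume."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


# Assumption: 8 TB means 8 * 10**12 bytes (decimal units).
mb_per_s, mbit_per_s = sustained_rate(8e12)
print(f"{mb_per_s:.1f} MB/s, {mbit_per_s:.1f} Mbit/s")  # 92.6 MB/s, 740.7 Mbit/s
```

Under these assumptions, the peak rate corresponds to roughly 93 MB/s, or about 0.74 Gbit/s, sustained around the clock.
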
## Target Audience
This documentation is written for:
- **Developers** working on the data transfer system or integrating with it
- **CCAT team members** who want to understand how data flows through the observatory
- **Operations staff** who need conceptual understanding for troubleshooting
For related documentation and in-depth guides on using and interfacing with the
system:
- **Instrument teams**: See {doc}`/source/instrument/integration` for API usage and data
filing
- **Scientists**: See {doc}`/source/scientists/guide` for accessing archived data
- **DevOps/Infrastructure**: See {doc}`/source/operations/datacenter_operations` for
deployment
## System Context
The Data Transfer System is one component of the larger CCAT Data Center:
```{eval-rst}
.. mermaid::

   graph TD
       subgraph Observatory["CCAT Observatory (Chile)"]
           PrimeCam["Prime-Cam"]
           CHAI["CHAI"]
           InstComp["Instrument Computers"]
           PrimeCam --> InstComp
           CHAI --> InstComp
       end

       InstComp -->|Data Flow| DTS["Data Transfer System<br/>• Package creation<br/>• Site-to-site transfer<br/>• Archive management"]

       DTS --> Cologne["Cologne LTA<br/>(Germany)<br/>• Long-term archive<br/>• Processing"]
       DTS --> Cornell["Optional, e.g. Cornell LTA<br/>(USA)<br/>• Long-term archive<br/>• Processing"]

       style Observatory fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
       style DTS fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
       style Cologne fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
       style Cornell fill:#fce4ec,stroke:#c2185b,stroke-width:2px
```
The system integrates with:
- {doc}`ops-db `: Tracks all data locations, operations, and metadata
- {doc}`ops-db-api `: Instruments use this to file new observations
- {doc}`ops-db-ui `: Provides visibility into system state
- {doc}`influxdb `: Stores transfer metrics and performance data
- {doc}`redis `: Coordinates distributed task execution
## Next Steps
**For Developers:**
- {doc}`source/api/index` - Complete API reference for all modules and functions
- {doc}`source/philosophy` - Understand the design principles and architectural decisions
- {doc}`source/concepts` - Learn the key concepts: Sites, DataLocations, Operations, Managers
**For Operations:**
- {doc}`source/pipeline` - Explore the 7-stage data processing pipeline
- {doc}`source/routing` - See how the system intelligently routes work to the right place
- {doc}`source/monitoring` - Learn about health checks, failure recovery, and observability
- {doc}`source/lifecycle` - Understand deletion policies and data lifecycle management