# Location Model

```{eval-rst}
.. verified:: 2025-11-25
   :reviewer: Christof Buchbender
```

CCAT data centers are geographically distributed across Chile, Cologne, and Cornell. Data
must therefore be tracked across multiple sites, each with multiple storage locations of
different types.

## Site

{py:class}`~ccat_ops_db.models.Site` groups the data locations that belong to the same
physical or logical place.

**Examples**:
: - CCAT (short_name: ccat): Cerro Chajnantor, Chile - telescope location
  - Cologne (short_name: cologne): University of Cologne, Germany - primary archive
  - Cornell (short_name: us): Cornell University, USA - US archive

For complete attribute details, see {py:class}`~ccat_ops_db.models.Site`.

## DataLocation

{py:class}`~ccat_ops_db.models.DataLocation` is the base class for all storage locations
with polymorphic storage types. It defines WHERE data can be stored within a site.

**LocationType Enum**: SOURCE (telescope instrument computers), BUFFER (intermediate storage),
LONG_TERM_ARCHIVE (permanent storage), PROCESSING (temporary analysis areas).

**StorageType Enum**: DISK (traditional filesystem), S3 (object storage), TAPE (tape libraries).

For complete attribute details, see {py:class}`~ccat_ops_db.models.DataLocation`.
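The two enums can be pictured with a minimal standalone sketch. The member names come from the lists above; the string values and the `LocationSketch` dataclass are assumptions made for illustration and are not the actual `ccat_ops_db` definitions.

```python
from dataclasses import dataclass
from enum import Enum


class LocationType(Enum):
    """Functional role of a location (string values are assumed)."""
    SOURCE = "source"                        # telescope instrument computers
    BUFFER = "buffer"                        # intermediate storage
    LONG_TERM_ARCHIVE = "long_term_archive"  # permanent storage
    PROCESSING = "processing"                # temporary analysis areas


class StorageType(Enum):
    """Technical backend of a location (string values are assumed)."""
    DISK = "disk"  # traditional filesystem
    S3 = "s3"      # object storage
    TAPE = "tape"  # tape libraries


@dataclass(frozen=True)
class LocationSketch:
    """Hypothetical stand-in for a DataLocation row."""
    name: str
    location_type: LocationType
    storage_type: StorageType


# The Cologne long-term archive: an archive role on an S3 backend.
cologne_lta = LocationSketch(
    "cologne_lta", LocationType.LONG_TERM_ARCHIVE, StorageType.S3
)
```

Keeping the role and the backend in separate enums means either one can change without affecting the other.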

## Polymorphic Storage Types

The database uses polymorphic inheritance to support different storage backends:

```{eval-rst}
.. mermaid::

   graph TB
       DL[DataLocation<br/>Base Class]
       DISK[DiskDataLocation]
       S3[S3DataLocation]
       TAPE[TapeDataLocation]

       DL -->|polymorphic| DISK
       DL -->|polymorphic| S3
       DL -->|polymorphic| TAPE

       style DL fill:#e1f5ff
       style DISK fill:#fff4e1
       style S3 fill:#ffe1f5
       style TAPE fill:#e1ffe1
```

### DiskDataLocation

{py:class}`~ccat_ops_db.models.DiskDataLocation` represents filesystem-based storage
(local or remote). Used for local telescope storage, network-mounted buffers, and processing areas.

**Example**: FYST source location at "telescope.ccat.cl:/data/fyst"

For complete attribute details, see {py:class}`~ccat_ops_db.models.DiskDataLocation`.

### S3DataLocation

{py:class}`~ccat_ops_db.models.S3DataLocation` represents object storage for large-scale
archival. Used for long-term archives and cloud storage. Credentials are retrieved via the
{py:func}`~ccat_ops_db.models.S3DataLocation.get_s3_credentials` method, which reads them
from environment variables that follow a naming pattern.

**Example**: Cologne long-term archive using Coscine S3-compatible storage

For complete attribute details, see {py:class}`~ccat_ops_db.models.S3DataLocation`.
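To illustrate the general idea of pattern-based credential lookup, here is a minimal sketch. The function name `get_s3_credentials_sketch`, the `S3_<NAME>_ACCESS_KEY` / `S3_<NAME>_SECRET_KEY` variable pattern, and the example values are all hypothetical. The real pattern is defined by `get_s3_credentials` and may differ.

```python
import os
from typing import Mapping, Optional, Tuple


def get_s3_credentials_sketch(
    location_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (access_key, secret_key) for a location from the environment.

    The variable naming pattern is an assumption for illustration only.
    """
    env = os.environ if env is None else env
    prefix = f"S3_{location_name.upper()}"  # hypothetical naming pattern
    try:
        return env[f"{prefix}_ACCESS_KEY"], env[f"{prefix}_SECRET_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable {missing}") from None


# A plain dict stands in for os.environ so the example has no side effects.
fake_env = {
    "S3_COLOGNE_LTA_ACCESS_KEY": "example-access",
    "S3_COLOGNE_LTA_SECRET_KEY": "example-secret",
}
access_key, secret_key = get_s3_credentials_sketch("cologne_lta", env=fake_env)
```

Reading credentials from the environment keeps secrets out of the database, which then only needs to store enough information to derive the variable names.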

### TapeDataLocation

{py:class}`~ccat_ops_db.models.TapeDataLocation` represents tape library systems for
deep archival. Used for long-term cold storage with high capacity and low access frequency.
Not currently in production, but supported by the architecture.

For complete attribute details, see {py:class}`~ccat_ops_db.models.TapeDataLocation`.

## Buffer Hierarchy and Failover

Multiple buffer locations can exist at a site, enabling failover and load distribution.

**Active Flag**: Indicates whether the location is operational

**Priority Field**: Defines failover order (lower number = higher priority)

**Use Case**: If the primary buffer is full or offline, data-transfer can route to the
secondary buffer.

**Example**:
: - cologne_buffer_1 (priority 0, active=True) - Primary buffer
  - cologne_buffer_2 (priority 1, active=True) - Secondary buffer

The system combines both fields when selecting a location:
: - **Priority**: Among active locations, the one with the lowest number is used first
  - **Active** flag: Setting it to false temporarily removes a location from selection, for example during maintenance
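The selection rule can be sketched in a few lines. `BufferSketch` and `select_buffer` are hypothetical names used for illustration, not the data-transfer implementation, and the sketch models only the `active` flag and `priority` field. The "buffer is full" condition is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class BufferSketch:
    """Hypothetical stand-in for a BUFFER DataLocation."""
    name: str
    priority: int  # lower number = higher priority
    active: bool   # False while the location is disabled


def select_buffer(buffers: Iterable[BufferSketch]) -> Optional[BufferSketch]:
    """Return the active buffer with the lowest priority number, if any."""
    candidates = [b for b in buffers if b.active]
    return min(candidates, key=lambda b: b.priority, default=None)


buffers = [
    BufferSketch("cologne_buffer_2", priority=1, active=True),
    BufferSketch("cologne_buffer_1", priority=0, active=True),
]

primary = select_buffer(buffers)   # cologne_buffer_1 (priority 0)

# Disable the primary buffer, e.g. for maintenance; selection falls over.
buffers[1].active = False
fallback = select_buffer(buffers)  # cologne_buffer_2
```

Because the decision depends only on stored field values, changing a flag or a priority in the database changes the routing without touching the code.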

## Example Locations

```{eval-rst}
.. list-table:: Example Data Locations
   :header-rows: 1
   :widths: 20 20 20 20 20

   * - Site
     - Name
     - LocationType
     - StorageType
     - Path/Details
   * - CCAT
     - fyst_source
     - SOURCE
     - DISK
     - telescope.ccat.cl:/data/fyst
   * - Cologne
     - cologne_buffer_1
     - BUFFER
     - DISK
     - buffer.data.uni-koeln.de:/mnt/buffer
   * - Cologne
     - cologne_lta
     - LONG_TERM_ARCHIVE
     - S3
     - bucket: ccat-archive
   * - Cologne
     - ramses_processing
     - PROCESSING
     - DISK
     - ramses.cluster:/scratch/ccat
```

## Why This Structure?

**Polymorphic Design**

: Allows different storage backends without changing core logic. The same code can work
  with disk, S3, or tape storage.

**Site Grouping**

: Enables geographic routing and replication strategies. Data can be replicated across
  multiple sites for redundancy.

**Location Type vs Storage Type**
: - `location_type` captures functional role (where in the workflow)
  - `storage_type` captures technical implementation
  - Separating the two allows flexibility: a BUFFER location could be DISK or S3, depending
    on site infrastructure

**Active/Priority Fields**

: Enable dynamic routing and failover without code changes. Locations can be disabled
  for maintenance or prioritized based on capacity.

## Integration with Physical Copies

Each {py:class}`~ccat_ops_db.models.PhysicalCopy` references a
{py:class}`~ccat_ops_db.models.DataLocation`. The `full_path` property combines the
location's base path or bucket with the file's relative path, depending on the storage type:

- For {py:class}`~ccat_ops_db.models.DiskDataLocation`: `DataLocation.path + file.relative_path`
- For {py:class}`~ccat_ops_db.models.S3DataLocation`: `DataLocation.bucket_name + file.relative_path` (S3 key)
- For {py:class}`~ccat_ops_db.models.TapeDataLocation`: `DataLocation.mount_path + file.relative_path`
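The three cases can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not the `ccat_ops_db` models: `full_path` is written as a method that takes the relative path, the `/` separator is an assumption, and the example paths are made up.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class DiskLocationSketch:
    path: str  # base directory of the disk location

    def full_path(self, relative_path: str) -> str:
        return posixpath.join(self.path, relative_path)


@dataclass
class S3LocationSketch:
    bucket_name: str

    def full_path(self, relative_path: str) -> str:
        # The relative path serves as the S3 key inside the bucket.
        return f"{self.bucket_name}/{relative_path}"


@dataclass
class TapeLocationSketch:
    mount_path: str  # where the tape system is mounted

    def full_path(self, relative_path: str) -> str:
        return posixpath.join(self.mount_path, relative_path)


relative = "obs_0001/scan.fits"  # hypothetical file.relative_path
locations = [
    DiskLocationSketch("/mnt/buffer"),
    S3LocationSketch("ccat-archive"),
    TapeLocationSketch("/mnt/tape"),
]

# The caller never checks the storage type; each class resolves its own path.
paths = [loc.full_path(relative) for loc in locations]
```

This is the practical benefit of the polymorphic design: calling code treats every location the same way and leaves backend-specific details to the subclass.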

## Geographic Distribution

Storage locations are distributed across the following sites:

- **CCAT Observatory (Chile)** - SOURCE and BUFFER locations at telescope
- **University of Cologne (Germany)** - BUFFER, LONG_TERM_ARCHIVE, and PROCESSING (RAMSES)
- **Cornell University (USA)** - Future archive site

**Future Expansion**: The architecture supports additional sites and multi-tiered transfer
routing (e.g., Chile → Cologne → Cornell).

## Related Documentation

- Complete API reference: {doc}`../api_reference/models`
- Transfer model: {doc}`transfer_model`
- Data model: {doc}`data_model`