# Design Rationale

```{eval-rst}
.. verified:: 2025-11-12
   :reviewer: Christof Buchbender
```

This document explains the key design decisions behind the ops-db-api architecture and why alternative approaches were rejected.

```{contents} Table of Contents
:depth: 2
:local: true
```

## The Remote Observatory Challenge

Understanding why this architecture exists requires understanding the environment:

### Physical Constraints

The CCAT observatory site:

- **Altitude**: 5600m (18,400 feet) - extreme environment
- **Location**: Atacama Desert, Chile - remote, harsh conditions
- **Distance**: 11,000+ km from main database in Cologne, Germany
- **Connectivity**: Fiber
- **Bandwidth**: Limited when available
- **Latency**: 200-500ms when connected, infinite when not

### The Unacceptable Solution

**What doesn't work**: Direct writes to the remote database

```python
# This fails during a network outage
def record_observation(obs_data):
    try:
        db.insert(obs_data)  # Blocks until remote DB responds
        db.commit()          # Fails if network down
    except NetworkError:
        # What now? Observation already happened!
        # Data is lost or requires manual intervention
        raise
```

**Problems**:

1. Blocks observation workflow on network latency
2. Fails completely during outages
3. No recovery mechanism
4. Lost data requires manual intervention

## Why Redis for Buffering?

We chose Redis over other buffering solutions for specific reasons:

### Redis Advantages

**1. In-Memory Speed**

> - Sub-millisecond write latency
> - No disk I/O blocking
> - Observation recording never waits

**2. Persistence Options**

> - AOF (Append-Only File) for durability
> - Configurable fsync policies
> - Survives process restart

**3. Data Structures**

> - Lists for transaction queue (LPUSH/RPOP)
> - Hashes for transaction metadata
> - Sets for tracking state
> - Pub/Sub for real-time updates

**4. Simple Deployment**

> - Single binary, minimal dependencies
> - Low resource overhead
> - Proven reliability

**5. Atomic Operations**

> - Transaction queuing is atomic
> - No race conditions
> - Safe for concurrent access

### Alternatives Considered

**RabbitMQ / Kafka**

> - **Pro**: Purpose-built for message queuing
> - **Pro**: Better durability guarantees
> - **Con**: Much heavier weight
> - **Con**: More complex deployment
> - **Con**: Overkill for single-site buffering

**Local SQLite**

> - **Pro**: SQL interface
> - **Pro**: ACID guarantees
> - **Con**: Disk I/O latency
> - **Con**: Locking under concurrent access
> - **Con**: No built-in pub/sub for updates

**PostgreSQL Queue Table**

> - **Pro**: Same database system
> - **Pro**: ACID guarantees
> - **Con**: Adds load to the database
> - **Con**: Defeats the purpose (still requires a DB connection)
> - **Con**: Slower than in-memory

**File System Queue**

> - **Pro**: Simple, no dependencies
> - **Pro**: Durable by default
> - **Con**: No atomicity guarantees
> - **Con**: Race conditions on concurrent access
> - **Con**: No built-in monitoring

**Decision**: Redis provides the best balance of speed, reliability, and simplicity for our use case.
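To make this concrete, the sketch below combines the structures listed above (a list as the transaction queue, a hash for per-transaction metadata, pub/sub for live updates). It is a minimal illustration under stated assumptions, not the actual ops-db-api implementation: it assumes the `redis-py` client, invented key names (`transactions:pending`, `transaction:<id>`), and a placeholder `db_insert` callable standing in for the real write path to Cologne.

```python
"""Illustrative buffering sketch (assumed names, not the real ops-db-api code)."""
import json
import uuid

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)


def record_observation(obs_data: dict) -> str:
    """Queue an observation write locally; never blocks on the remote DB."""
    tx_id = str(uuid.uuid4())

    # Hash: per-transaction metadata, kept for inspection and recovery.
    r.hset(f"transaction:{tx_id}", mapping={
        "status": "pending",
        "payload": json.dumps(obs_data),
    })

    # List: FIFO transaction queue (LPUSH here, RPOP in the forwarder).
    r.lpush("transactions:pending", tx_id)

    # Pub/Sub: notify listeners (e.g., a monitoring UI) in real time.
    r.publish("transactions:events", json.dumps({"id": tx_id, "event": "queued"}))
    return tx_id


def forward_pending(db_insert) -> None:
    """Background worker: drain the queue whenever the main site is reachable."""
    while (tx_id := r.rpop("transactions:pending")) is not None:
        payload = json.loads(r.hget(f"transaction:{tx_id}", "payload"))
        try:
            db_insert(payload)  # the actual write to the Cologne database
        except Exception:
            # Network or database failure: requeue and retry on the next pass.
            r.rpush("transactions:pending", tx_id)
            break
        r.hset(f"transaction:{tx_id}", "status", "synced")
```

Because `LPUSH` and `RPOP` are atomic, concurrent producers can enqueue safely, and a transaction that cannot be forwarded is simply pushed back onto the queue for the next attempt.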
## Why PostgreSQL Streaming Replication?

We use PostgreSQL's built-in streaming replication rather than application-level sync:

### Streaming Replication Benefits

**1. Built-in Mechanism**

> - Native PostgreSQL feature
> - Battle-tested in production
> - No custom sync logic needed

**2. Log Sequence Numbers (LSN)**

> - Precise replication state tracking
> - Know exactly what data has replicated
> - Enable smart cache management

**3. Read-Only Replicas**

> - Safe by design (can't accidentally write)
> - Multiple replicas possible
> - No conflict resolution needed

**4. Continuous Sync**

> - Changes stream as they happen
> - No batch delays
> - Minimal lag when network available

**5. Automatic Catchup**

> - Replica catches up automatically after network restoration
> - Uses the WAL (Write-Ahead Log) for efficiency
> - No manual intervention required

### Why Not Application-Level Sync?

**Alternative**: Sync tables at the application level (e.g., with timestamps)

**Problems**:

1. **Conflict Resolution**: What if the same record is modified at both sites?
2. **Cascading Updates**: Foreign key relationships become complex
3. **Schema Changes**: Every database change needs a sync logic update
4. **Partial Failure**: Some tables sync, others don't - inconsistent state
5. **Performance**: Full table scans or complex timestamp tracking

**PostgreSQL streaming replication** avoids all these issues by replicating at the database engine level.

## Why Eventual Consistency?

We explicitly choose eventual consistency over strong consistency:

### The Trade-Off

**Strong Consistency** (rejected):

> - Every write waits for confirmation from all replicas
> - Guarantees immediate consistency everywhere
> - **Blocks when network down** ❌

**Eventual Consistency** (chosen):

> - Writes succeed locally, sync in background
> - Data becomes consistent "eventually" (seconds to minutes)
> - **Never blocks operations** ✅

### Why This Is Acceptable

**For Observatory Operations**:

- An observation recorded at 00:00:00 doesn't need to be visible in Cologne at 00:00:01
- Scientists querying data hours or days later don't care about seconds of lag
- **Reliability > Immediate Consistency**

**For UI Users**:

- Transfer monitoring: seconds of lag is imperceptible
- Data browsing: minutes-old data is fine for most use cases
- Real-time updates: WebSockets provide immediate local feedback

**Critical**: We track replication state precisely with LSNs, so we *know* when consistency is achieved.

## Why Not Multi-Master?

**Multi-Master Replication** (rejected): Both sites can write, and changes sync bidirectionally.

**Sounds Good, But**:

1. **Conflict Resolution**: What if both sites create the same ID?

   ```sql
   -- Site A creates:
   INSERT INTO obs (id, status) VALUES (123, 'running');
   -- Simultaneously, Site B creates:
   INSERT INTO obs (id, status) VALUES (123, 'completed');
   -- Which one wins when they sync?
   ```

2. **Complex Logic**: Custom resolution is needed for every conflict type
3. **Partial Updates**: Site A updates field X, Site B updates field Y - merge how?
4. **Schema Evolution**: Database changes must work on both sites simultaneously
5. **Testing Complexity**: Every scenario needs a conflict resolution test

**Our Single-Master Approach**:

- All writes go to Cologne (authoritative)
- Observatory buffers and forwards
- No conflicts possible
- Simple, predictable behavior

### Cost of This Approach

We pay a cost for this simplicity:

- The observatory can't immediately see writes made in Cologne (its copy is a read-only replica that lags behind)
- A network partition means buffered data accumulates at the observatory
- A main-site failure blocks all writes (but buffering prevents data loss)

**We accept these costs** because the alternative (conflict resolution) is far more complex and error-prone. Writes made in Cologne are not needed at the observatory immediately.
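The LSN tracking mentioned above also makes the first of these costs measurable: the replica can report exactly how far it lags behind the authoritative database. The sketch below is illustrative only; it assumes `psycopg2` and placeholder connection strings, and relies on the standard PostgreSQL functions `pg_current_wal_lsn()`, `pg_last_wal_replay_lsn()`, and `pg_wal_lsn_diff()` (PostgreSQL 10+).

```python
"""Illustrative replication-lag check (placeholder DSNs, not production config)."""
import psycopg2

PRIMARY_DSN = "host=primary.example dbname=ops user=monitor"  # placeholder
REPLICA_DSN = "host=replica.example dbname=ops user=monitor"  # placeholder


def replication_lag_bytes() -> int:
    """Return how many WAL bytes the replica still has to replay."""
    # Current write position on the authoritative (primary) database.
    with psycopg2.connect(PRIMARY_DSN) as primary, primary.cursor() as cur:
        cur.execute("SELECT pg_current_wal_lsn()")
        primary_lsn = cur.fetchone()[0]

    # Replay position on the read-only replica, compared against the primary.
    with psycopg2.connect(REPLICA_DSN) as replica, replica.cursor() as cur:
        cur.execute(
            "SELECT pg_wal_lsn_diff(%s::pg_lsn, pg_last_wal_replay_lsn())",
            (primary_lsn,),
        )
        lag = cur.fetchone()[0]

    return int(lag)


if __name__ == "__main__":
    print(f"replica is {replication_lag_bytes()} bytes behind (0 = caught up)")
```

A check like this is what makes "eventually" quantifiable: when the reported lag reaches zero, the replica is known to be consistent with Cologne up to the measured LSN.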
## Technology Choices

### Framework: FastAPI

**Why FastAPI**:

- Modern async Python framework
- Automatic OpenAPI documentation
- Type hints and validation (Pydantic)
- WebSocket support built-in
- High performance (comparable to Go, Node.js)

**Why not Django/Flask**:

- Django: Too heavyweight, REST framework less modern
- Flask: Lacks async support, no automatic docs

### ORM: SQLAlchemy

**Why SQLAlchemy**:

- Industry-standard Python ORM
- Powerful query builder
- Async support (SQLAlchemy 1.4+)
- Used by the ops-db package (consistency)

**Why not Django ORM**:

- Tied to the Django framework
- Less flexible for complex queries

### Database: PostgreSQL

**Why PostgreSQL**:

- Streaming replication built-in
- LSN tracking for replication state
- JSON/JSONB for flexible schemas
- Proven at scale
- Excellent ops-db package support

**Why not MySQL/MariaDB**:

- Replication less sophisticated
- No LSN equivalent
- Weaker JSON support

### Cache/Buffer: Redis

**Why Redis**:

- Discussed in detail above
- In-memory speed
- Versatile data structures
- Simple deployment

### Language: Python

**Why Python**:

- Astronomy/science community standard
- Rich ecosystem (astropy, numpy, etc.)
- Easy for scientists to extend
- Same language as the ops-db package

## Summary

Every major design decision has a rationale:

- **Unified API**: Shared infrastructure, planned split later
- **Redis Buffering**: Speed + reliability + simplicity
- **PostgreSQL Replication**: Built-in, LSN tracking, proven
- **Eventual Consistency**: Reliability over immediate consistency
- **Single Master**: Simplicity over conflict resolution
- **FastAPI + SQLAlchemy + PostgreSQL + Redis**: Modern, proven, Python ecosystem

These decisions optimize for the **critical requirement**: **Observatory operations must never fail due to network issues.**

## Next Steps

- {doc}`distributed-architecture` - How the database topology works
- {doc}`reliability-first` - Deep dive into reliability mechanisms