Design Rationale#
This document explains the key design decisions behind the ops-db-api architecture and why alternative approaches were rejected.
The Remote Observatory Challenge#
Understanding why this architecture exists requires understanding the environment:
Physical Constraints#
The CCAT observatory site:
- Altitude: 5600 m (18,400 ft), an extreme environment
- Location: Atacama Desert, Chile; remote, harsh conditions
- Distance: 11,000+ km from the main database in Cologne, Germany
- Connectivity: fiber link, subject to outages
- Bandwidth: limited when available
- Latency: 200-500 ms when connected, infinite when not
The Unacceptable Solution#
**What doesn't work:** direct writes to the remote database.

```python
# This fails during a network outage. `db` stands in for a client connected
# to the remote PostgreSQL instance; NetworkError for the driver's
# connection-failure exceptions.
def record_observation(obs_data):
    try:
        db.insert(obs_data)  # Blocks until the remote DB responds
        db.commit()          # Fails if the network is down
    except NetworkError:
        # What now? The observation already happened!
        # Data is lost or requires manual intervention.
        raise
```
Problems:

- Blocks the observation workflow on network latency
- Fails completely during outages
- No recovery mechanism
- Lost data requires manual intervention
Why Redis for Buffering?#
We chose Redis over other buffering solutions for specific reasons:
Redis Advantages#
**1. In-Memory Speed**

- Sub-millisecond write latency
- No disk I/O blocking
- Observation recording never waits

**2. Persistence Options**

- AOF (Append-Only File) for durability
- Configurable fsync policies
- Survives process restart

**3. Data Structures** (see the sketch after this list)

- Lists for the transaction queue (LPUSH/RPOP)
- Hashes for transaction metadata
- Sets for tracking state
- Pub/Sub for real-time updates

**4. Simple Deployment**

- Single binary, minimal dependencies
- Low resource overhead
- Proven reliability

**5. Atomic Operations**

- Transaction queuing is atomic
- No race conditions
- Safe for concurrent access
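In practice, the buffering pattern these properties enable looks roughly like the following. This is a minimal sketch using the redis-py client; the key names (`txn:queue`, `txn:meta:*`), the channel, and the payload shape are illustrative, not the actual ops-db-api schema.

```python
import json

import redis

# Assumes a local Redis configured for durability (appendonly yes,
# appendfsync everysec), per "Persistence Options" above.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def buffer_transaction(txn_id: str, obs_data: dict) -> None:
    """Queue a transaction locally; never touches the link to Cologne."""
    pipe = r.pipeline(transaction=True)              # MULTI/EXEC: atomic
    pipe.hset(f"txn:meta:{txn_id}", mapping={"status": "pending"})
    pipe.lpush("txn:queue", json.dumps({"id": txn_id, "data": obs_data}))
    pipe.publish("txn:events", f"queued {txn_id}")   # real-time UI updates
    pipe.execute()

def next_transaction() -> "dict | None":
    """A forwarder pops the oldest queued transaction (FIFO via LPUSH/RPOP)."""
    raw = r.rpop("txn:queue")
    return json.loads(raw) if raw else None
```

Because LPUSH and RPOP operate on opposite ends of the list, the queue is strictly FIFO, and the MULTI/EXEC pipeline guarantees that the metadata, the queue entry, and the notification land together or not at all.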
Alternatives Considered#
**RabbitMQ / Kafka**

- Pro: Purpose-built for message queuing
- Pro: Better durability guarantees
- Con: Much heavier weight
- Con: More complex deployment
- Con: Overkill for single-site buffering

**Local SQLite**

- Pro: SQL interface
- Pro: ACID guarantees
- Con: Disk I/O latency
- Con: Locking under concurrent access
- Con: No built-in pub/sub for updates

**PostgreSQL Queue Table**

- Pro: Same database system
- Pro: ACID guarantees
- Con: Adds load to the database
- Con: Defeats the purpose (still requires a DB connection)
- Con: Slower than in-memory

**File System Queue**

- Pro: Simple, no dependencies
- Pro: Durable by default
- Con: No atomicity guarantees
- Con: Race conditions on concurrent access
- Con: No built-in monitoring
**Decision:** Redis provides the best balance of speed, reliability, and simplicity for our use case.
Why PostgreSQL Streaming Replication?#
We use PostgreSQL’s built-in streaming replication rather than application-level sync:
Streaming Replication Benefits#
**1. Built-in Mechanism**

- Native PostgreSQL feature
- Battle-tested in production
- No custom sync logic needed

**2. Log Sequence Numbers (LSN)** (see the sketch after this list)

- Precise replication state tracking
- Know exactly what data has replicated
- Enables smart cache management

**3. Read-Only Replicas**

- Safe by design (can't accidentally write)
- Multiple replicas possible
- No conflict resolution needed

**4. Continuous Sync**

- Changes stream as they happen
- No batch delays
- Minimal lag when the network is available

**5. Automatic Catchup**

- Replica catches up automatically after network restoration
- Uses the WAL (Write-Ahead Log) for efficiency
- No manual intervention required
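Concretely, LSN tracking can be done with PostgreSQL's standard administration functions. A minimal sketch using SQLAlchemy; the connection strings are illustrative.

```python
from sqlalchemy import create_engine, text

primary = create_engine("postgresql://ops@cologne-db/ops")      # hypothetical DSNs
replica = create_engine("postgresql://ops@observatory-db/ops")

def replication_caught_up() -> bool:
    """True once the replica has replayed everything the primary has written."""
    with primary.connect() as p, replica.connect() as r:
        written = p.execute(text("SELECT pg_current_wal_lsn()")).scalar()
        replayed = r.execute(text("SELECT pg_last_wal_replay_lsn()")).scalar()
        # pg_wal_lsn_diff() returns the number of WAL bytes between two LSNs
        lag = p.execute(
            text("SELECT pg_wal_lsn_diff(CAST(:a AS pg_lsn), CAST(:b AS pg_lsn))"),
            {"a": written, "b": replayed},
        ).scalar()
    return lag <= 0
```

Once this reports true, any locally cached state held only to cover the replication window can safely be evicted; that is what enables the smart cache management mentioned above.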
Why Not Application-Level Sync?#
**Alternative:** sync tables at the application level (e.g., with timestamps)

Problems:

1. **Conflict Resolution:** What if the same record is modified at both sites?
2. **Cascading Updates:** Foreign-key relationships become complex
3. **Schema Changes:** Every DB change needs a sync-logic update
4. **Partial Failure:** Some tables sync while others don't, leaving inconsistent state
5. **Performance:** Full table scans or complex timestamp tracking
PostgreSQL streaming replication avoids all these issues by replicating at the database engine level.
Why Eventual Consistency?#
We explicitly choose eventual consistency over strong consistency:
The Trade-Off#
**Strong Consistency (rejected):**

- Every write waits for confirmation from all replicas
- Guarantees immediate consistency everywhere
- Blocks when the network is down ❌

**Eventual Consistency (chosen):**

- Writes succeed locally, sync in the background
- Data becomes consistent "eventually" (seconds to minutes)
- Never blocks operations ✅
Why This Is Acceptable#
**For Observatory Operations:**

- An observation recorded at 00:00:00 doesn't need to be visible in Cologne at 00:00:01
- Scientists querying data hours or days later don't care about seconds of lag
- Reliability > immediate consistency

**For UI Users:**

- Transfer monitoring: seconds of lag is imperceptible
- Data browsing: minutes-old data is fine for most use cases
- Real-time updates: WebSockets provide immediate local feedback
**Critical:** We track replication state precisely via LSNs, so we know exactly when consistency has been achieved.
Why Not Multi-Master?#
**Multi-Master Replication (rejected):** both sites can write, and changes sync bidirectionally.

Sounds good, but:

1. **Conflict Resolution:** What if both sites create the same ID?

   ```sql
   -- Site A creates:
   INSERT INTO obs (id, status) VALUES (123, 'running');
   -- Simultaneously, Site B creates:
   INSERT INTO obs (id, status) VALUES (123, 'completed');
   -- Which one wins when they sync?
   ```

2. **Complex Logic:** Custom resolution is needed for every conflict type
3. **Partial Updates:** Site A updates field X, Site B updates field Y; how do you merge?
4. **Schema Evolution:** Database changes must work on both sites simultaneously
5. **Testing Complexity:** Every scenario needs a conflict-resolution test
**Our Single-Master Approach:**

- All writes go to Cologne (authoritative)
- The observatory buffers and forwards
- No conflicts possible
- Simple, predictable behavior
Cost of This Approach#
We pay a cost for this simplicity:
- The observatory can't immediately see writes from Cologne (read-only replica)
- A network partition means buffered data accumulates
- A main-site failure blocks all writes (though buffering prevents data loss)

We accept these costs because the alternative, conflict resolution, is far more complex and error-prone, and writes originating in Cologne are not needed at the observatory immediately.
Technology Choices#
Framework: FastAPI#
Why FastAPI:

- Modern async Python framework
- Automatic OpenAPI documentation
- Type hints and validation (Pydantic)
- WebSocket support built in
- High performance (comparable to Node.js and Go frameworks)

Why not Django/Flask:

- Django: too heavyweight; Django REST Framework is less modern
- Flask: lacks native async support and automatic docs
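To make these bullets concrete, here is a minimal sketch of the features named above (async endpoints, Pydantic validation, built-in WebSockets). The routes and the `Observation` model are illustrative, not the real ops-db-api surface.

```python
from fastapi import FastAPI, WebSocket
from pydantic import BaseModel

app = FastAPI(title="ops-db-api")  # OpenAPI docs are generated automatically

class Observation(BaseModel):
    obs_id: int
    status: str

@app.post("/observations")
async def record_observation(obs: Observation) -> dict:
    # The request body is validated against the Pydantic model before this runs
    return {"queued": obs.obs_id}

@app.websocket("/ws/transfers")
async def transfer_updates(ws: WebSocket):
    await ws.accept()
    await ws.send_json({"event": "connected"})  # push real-time updates
```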
ORM: SQLAlchemy#
Why SQLAlchemy:

- Industry-standard Python ORM
- Powerful query builder
- Async support (SQLAlchemy 1.4+)
- Used by the ops-db package (consistency)

Why not Django ORM:

- Tied to the Django framework
- Less flexible for complex queries
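A small sketch of the query builder and asyncio support referenced above, assuming the asyncpg driver; the `observations` table and the DSN are illustrative.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select
from sqlalchemy.ext.asyncio import create_async_engine

metadata = MetaData()
observations = Table(
    "observations", metadata,
    Column("id", Integer, primary_key=True),
    Column("status", String),
)

engine = create_async_engine("postgresql+asyncpg://ops@localhost/ops")

async def running_observations():
    # select() builds the SQL; execution is non-blocking under asyncio
    async with engine.connect() as conn:
        result = await conn.execute(
            select(observations).where(observations.c.status == "running")
        )
        return result.fetchall()
```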
Database: PostgreSQL#
Why PostgreSQL:

- Streaming replication built in
- LSN tracking for replication state
- JSON/JSONB for flexible schemas (see the sketch after this list)
- Proven at scale
- Excellent ops-db package support

Why not MySQL/MariaDB:

- Replication is less sophisticated
- No direct LSN equivalent
- Weaker JSON support
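As flagged above, JSONB deserves a quick illustration: flexible per-record metadata can live in a single indexable column without schema changes. A hypothetical example; the table and query are not from the real schema.

```python
# Hypothetical JSONB usage via SQLAlchemy textual SQL.
from sqlalchemy import text

DDL = text("""
    CREATE TABLE IF NOT EXISTS obs_meta (
        obs_id  integer PRIMARY KEY,
        extra   jsonb NOT NULL DEFAULT '{}'::jsonb
    )
""")

# The @> containment operator can use a GIN index on the jsonb column
FIND = text("SELECT obs_id FROM obs_meta WHERE extra @> CAST(:filter AS jsonb)")
# e.g. conn.execute(FIND, {"filter": '{"instrument": "cam1"}'})
```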
Cache/Buffer: Redis#
Why Redis:

- Discussed in detail above
- In-memory speed
- Versatile data structures
- Simple deployment
Language: Python#
Why Python:

- Standard in the astronomy/science community
- Rich ecosystem (astropy, numpy, etc.)
- Easy for scientists to extend
- Same language as the ops-db package
Summary#
Every major design decision has a rationale:
- Unified API: shared infrastructure, with a planned split later
- Redis buffering: speed + reliability + simplicity
- PostgreSQL replication: built in, LSN tracking, proven
- Eventual consistency: reliability over immediate consistency
- Single master: simplicity over conflict resolution
- FastAPI + SQLAlchemy + PostgreSQL + Redis: modern, proven, Python ecosystem

These decisions optimize for the critical requirement: **observatory operations must never fail due to network issues.**
Next Steps#
- Distributed Architecture - how the database topology works
- Reliability First - a deep dive into the reliability mechanisms