# Site Configuration

```{eval-rst}
.. verified:: 2025-11-12
   :reviewer: Christof Buchbender
```

The ops-db-api uses site-aware configuration to automatically adapt its behavior based on deployment location (main site vs secondary site).

```{contents} Table of Contents
:depth: 2
:local: true
```

## Overview

**Site Type** determines how the API behaves:

- **MAIN** site (Cologne): Direct database access, no buffering
- **SECONDARY** site (Observatory): Buffered writes, replica reads

The site configuration is managed by the `SiteConfig` class:

```{literalinclude} ../../ccat_ops_db_api/transaction_buffering/site_config.py
:emphasize-lines: 1-4, 12-14, 56-60
:language: python
:lines: 17-91
```

## Configuration Sources

Settings are loaded from multiple sources (in order of precedence):

1. **Environment variables** (highest priority)
2. **`.env` file** (development)
3. **`config/settings.toml`** (application defaults)

### Example `.env` File

```bash
# Site Identity
SITE_NAME=observatory
SITE_TYPE=secondary  # "main" or "secondary"

# Main Database (for writes)
MAIN_DB_TYPE=postgresql
MAIN_DB_HOST=main-db.example.com
MAIN_DB_PORT=5432
MAIN_DB_USER=ccat_ops_user
MAIN_DB_PASSWORD=secure_password
MAIN_DB_NAME=ccat_ops_db

# Local Database (for reads)
LOCAL_DB_TYPE=postgresql
LOCAL_DB_HOST=localhost
LOCAL_DB_PORT=5432
LOCAL_DB_USER=ccat_ops_user
LOCAL_DB_PASSWORD=secure_password
LOCAL_DB_NAME=ccat_ops_db

# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=

# Transaction Buffering
CRITICAL_OPERATIONS_BUFFER=true
TRANSACTION_BUFFER_SIZE=1000
TRANSACTION_RETRY_ATTEMPTS=3
TRANSACTION_RETRY_DELAY=5
BACKGROUND_PROCESSING_INTERVAL=1.0

# LSN Tracking
LSN_TRACKING_ENABLED=true
LSN_CHECK_INTERVAL=0.1
LSN_TIMEOUT=30

# Authentication
SECRET_KEY=your-secret-key-change-in-production
GITHUB_CLIENT_ID=your-github-oauth-client-id
GITHUB_CLIENT_SECRET=your-github-oauth-client-secret
```

### Example `settings.toml`

```toml
[default]
site_name = "institute"
site_type = "main"

[default.database]
main_db_host = "localhost"
main_db_port = 5432
local_db_host = "localhost"
local_db_port = 5432

[default.redis]
host = "localhost"
port = 6379

[default.transaction_buffering]
buffer_size = 1000
retry_attempts = 3
retry_delay = 5
background_processing_interval = 1.0
```

## Site Types

### MAIN Site Configuration

**Typical deployment**: Cologne data center

**Configuration**:

```bash
SITE_NAME=cologne
SITE_TYPE=main
MAIN_DB_HOST=localhost   # Local database
LOCAL_DB_HOST=localhost  # Same as main
```

**Behavior**:

- All writes go directly to local database (no buffering)
- All reads from local database
- Redis used only for caching (not buffering)
- Background processor disabled or minimal activity
- LSN tracking not needed

**Use cases**:

- Production main site
- Development when testing direct operations
- Any location with reliable, low-latency access to main database

### SECONDARY Site Configuration

**Typical deployment**: CCAT observatory, Chile

**Configuration**:

```bash
SITE_NAME=observatory
SITE_TYPE=secondary
MAIN_DB_HOST=main-db.example.com  # Remote main database
LOCAL_DB_HOST=localhost           # Local read-only replica
```

**Behavior**:

- Critical writes buffered in Redis, executed asynchronously
- Non-critical writes fail or redirect to main
- Reads from local replica merged with buffered data
- Redis used for buffering, read buffer, and caching
- Background processor actively processes buffer
- LSN tracking monitors replication state

**Use cases**:

- Production observatory site
- Development when testing transaction buffering
- Any location with unreliable access to main database

## Operation Routing

### Site Configuration Logic

The `should_buffer_operation()` method determines buffering:

```{literalinclude} ../../ccat_ops_db_api/transaction_buffering/site_config.py
:language: python
:lines: 142-164
```

### Decision Tree

```{eval-rst}
.. mermaid::

   graph TD
       Start[Operation Request]
       CheckSite{Site Type?}
       CheckOp{Operation Type?}
       CheckBuffer{Buffering Enabled?}

       Start --> CheckSite
       CheckSite -->|MAIN| DirectWrite[Direct Write to DB]
       CheckSite -->|SECONDARY| CheckOp
       CheckOp -->|critical| CheckBuffer
       CheckOp -->|non-critical| DirectWrite
       CheckBuffer -->|true| Buffer[Buffer to Redis]
       CheckBuffer -->|false| DirectWrite

       Buffer --> Success[Return Immediately]
       DirectWrite --> Success

       style Buffer fill:#FFD700
       style DirectWrite fill:#90EE90
```

### Database URL Selection

The API selects appropriate database URLs based on site and operation:

```python
def get_database_url(operation_type: str = "default") -> str:
    site_config = get_site_config()

    if site_config.is_main_site:
        # Main site: all operations use main database
        return site_config.get_main_database_url()
    else:
        # Secondary site: use local replica for reads
        # Background processor uses main for writes
        return site_config.get_local_database_url()
```

## Redis Key Namespacing

Redis keys are namespaced per site to prevent conflicts:

### Key Prefix Structure

```python
# Base prefix
f"site:{site_name}"

# Transaction buffer
f"site:{site_name}:transaction_buffer"

# Failed transactions
f"site:{site_name}:failed_transactions"

# Transaction status
f"site:{site_name}:transaction:{transaction_id}"

# Cache
f"site:{site_name}:cache:{cache_key}"
```

### Example Keys

For site named "observatory":

```text
site:observatory:transaction_buffer
site:observatory:failed_transactions
site:observatory:transaction:abc-123-def
site:observatory:cache:observation_summary:456
```

This allows:

- Multiple sites to share a Redis instance (testing)
- Clear isolation between site data
- Easy cleanup per site
- Simple monitoring per site

## Accessing Configuration

### In Code

```python
from ccat_ops_db_api.transaction_buffering import get_site_config

# Get global site configuration
site_config = get_site_config()

# Check site properties
if site_config.is_main_site:
    print("Running at main site")

if site_config.is_secondary_site:
    print("Running at secondary site")

# Get database URLs
main_url = site_config.get_main_database_url()
local_url = site_config.get_local_database_url()

# Get Redis keys
buffer_key = site_config.get_transaction_buffer_key()

# Check buffering decision
should_buffer = site_config.should_buffer_operation("critical")
```

### Via API Endpoint

```bash
curl http://localhost:8000/api/site/info
```

Response:

```json
{
  "site_name": "observatory",
  "site_type": "secondary",
  "is_main_site": false,
  "is_secondary_site": true,
  "critical_operations_buffer": true,
  "transaction_buffer_size": 1000,
  "lsn_tracking_enabled": true
}
```

## Configuration Best Practices

### Development Configuration

**Single database setup** (simplest):

```bash
SITE_TYPE=main
MAIN_DB_HOST=localhost
LOCAL_DB_HOST=localhost
REDIS_HOST=localhost
```

**Testing buffering** (simulate observatory):

```bash
SITE_TYPE=secondary
MAIN_DB_HOST=localhost   # Pretend it's remote
LOCAL_DB_HOST=localhost  # Pretend it's replica
REDIS_HOST=localhost
CRITICAL_OPERATIONS_BUFFER=true
```

### Production Configuration

**Main site** (Cologne):

```bash
SITE_TYPE=main
MAIN_DB_HOST=localhost
LOCAL_DB_HOST=localhost
REDIS_HOST=redis.internal
CRITICAL_OPERATIONS_BUFFER=false
```

**Secondary site** (Observatory):

```bash
SITE_TYPE=secondary
MAIN_DB_HOST=db.cologne.example.com
LOCAL_DB_HOST=localhost
REDIS_HOST=localhost
CRITICAL_OPERATIONS_BUFFER=true
LSN_TRACKING_ENABLED=true
```

### Security Considerations

**Sensitive values** (never commit):

- Database passwords
- SECRET_KEY
- GITHUB_CLIENT_SECRET
- API tokens

**Use environment variables or secrets management**:

```bash
# Load from secrets manager
MAIN_DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id db-password --query SecretString --output text)
```

## Configuration Validation

### Runtime Configuration Changes

**Not supported**: Configuration changes require restart

**Why**:

- Site type affects fundamental behavior
- Connection pools established at startup
- Background processor started based on site type

**To change**: Update
configuration and restart the API

## Monitoring Configuration

### Configuration Metrics

Expose configuration via metrics:

```text
# Prometheus-style metrics
site_type_info{site_name="observatory", site_type="secondary"} 1
buffering_enabled{site_name="observatory"} 1
lsn_tracking_enabled{site_name="observatory"} 1
```

### Configuration Endpoint

The `/api/site/info` endpoint provides real-time configuration state:

```bash
curl http://localhost:8000/api/site/info | jq .
```

Useful for:

- Verifying deployment configuration
- Debugging site-specific behavior
- Monitoring dashboards
- Integration tests

## Summary

Site configuration:

- **Determines behavior**: MAIN (direct) vs SECONDARY (buffered)
- **Environment-driven**: Environment variables > .env > settings.toml
- **Redis namespaced**: Keys prefixed with site name
- **Validated at startup**: Fails fast if misconfigured
- **Accessible at runtime**: Via code and API endpoint

Key configuration parameters:

- `SITE_TYPE`: "main" or "secondary"
- `MAIN_DB_HOST`: Target for writes
- `LOCAL_DB_HOST`: Target for reads
- `CRITICAL_OPERATIONS_BUFFER`: Enable/disable buffering
- `LSN_TRACKING_ENABLED`: Enable/disable replication tracking

## Next Steps

- {doc}`authentication-system` - Authentication configuration
- {doc}`../deep-dive/transaction-buffering/overview` - How buffering works
- {doc}`../quickstart/running-locally` - Testing different configurations
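
## Appendix: Routing Sketch

The routing decision tree and the Redis key prefix structure described on this page can be approximated in a few lines. The following is a minimal sketch only: `SketchSiteConfig` is a hypothetical stand-in written for illustration, not the real `SiteConfig` class from `site_config.py`, and the actual implementation may differ. Only the method names `should_buffer_operation()` and `get_transaction_buffer_key()` and the key format are taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSiteConfig:
    """Hypothetical stand-in for SiteConfig, for illustration only."""

    site_name: str
    site_type: str  # "main" or "secondary"
    critical_operations_buffer: bool = True

    @property
    def is_main_site(self) -> bool:
        return self.site_type == "main"

    def should_buffer_operation(self, operation_type: str) -> bool:
        # Follows the decision tree: MAIN always writes directly.
        if self.is_main_site:
            return False
        # SECONDARY: only critical operations are candidates for buffering.
        if operation_type != "critical":
            return False
        # Buffering can still be switched off via configuration.
        return self.critical_operations_buffer

    def get_transaction_buffer_key(self) -> str:
        # Key format from the "Key Prefix Structure" section.
        return f"site:{self.site_name}:transaction_buffer"


observatory = SketchSiteConfig("observatory", "secondary")
cologne = SketchSiteConfig("cologne", "main")

print(observatory.should_buffer_operation("critical"))      # True
print(observatory.should_buffer_operation("non-critical"))  # False
print(cologne.should_buffer_operation("critical"))          # False
print(observatory.get_transaction_buffer_key())             # site:observatory:transaction_buffer
```

Keeping the decision in one pure method, as sketched here, makes it straightforward to unit-test each branch of the tree without a database or Redis connection.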