# Authentication System

The ops-db-api supports two authentication methods through a unified interface: GitHub OAuth (for UI users) and API tokens (for service automation).

```{contents} Table of Contents
:depth: 2
:local: true
```

## Overview

**Two Token Types, One Interface**:

1. **GitHub OAuth + JWT**: For human users accessing the web UI
2. **API Tokens**: For service scripts and automation

Both use the same `Authorization: Bearer TOKEN` header format, so a client sends either token type in exactly the same way.

```{eval-rst}
.. mermaid::

   graph TB
       Client[Client Request]
       Auth[Unified Authentication]
       JWT[JWT Validator]
       APIToken[API Token Validator]
       User[(User Database)]

       Client -->|Authorization: Bearer TOKEN| Auth
       Auth --> JWT
       Auth --> APIToken
       JWT --> User
       APIToken --> User
       JWT -->|Valid| Success[Authenticated User]
       APIToken -->|Valid| Success

       style Auth fill:#90EE90
       style Success fill:#87CEEB
```

## Authentication Flow

### Unified Token Validation

The `get_current_user()` dependency handles both token types:

```{literalinclude} ../../ccat_ops_db_api/auth/unified_auth.py
:emphasize-lines: 10-15, 25-30
:language: python
:lines: 70-130
```

### Request Header Format

Both authentication methods use the same header. With a JWT:

```http
GET /api/transfer/overview HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```

Or with an API token:

```http
POST /executed_obs_units/start HTTP/1.1
Host: api.example.com
Authorization: Bearer ops_api_token_abc123xyz789...
```

### Token Type Detection

The system detects the token type from its prefix:

```python
async def determine_token_type(token: str) -> str:
    if token.startswith("ops_api_token_"):
        return "api_token"
    else:
        # Assume JWT (can also check JWT structure)
        return "jwt"
```

## GitHub OAuth + JWT

### OAuth Flow

```{eval-rst}
.. mermaid::

   sequenceDiagram
       participant User
       participant Frontend
       participant API
       participant GitHub

       User->>Frontend: Click "Login with GitHub"
       Frontend->>API: GET /github/login
       API->>GitHub: Redirect to OAuth
       GitHub->>User: Authorization page
       User->>GitHub: Approve
       GitHub->>API: Callback with code
       API->>GitHub: Exchange code for access token
       GitHub-->>API: Access token
       API->>GitHub: Get user info
       GitHub-->>API: User profile
       API->>API: Create or update user
       API->>API: Generate JWT
       API->>Frontend: Redirect with JWT
       Frontend->>Frontend: Store JWT
       Frontend->>API: Subsequent requests with JWT
```

### OAuth Configuration

Required environment variables:

```bash
GITHUB_CLIENT_ID=your_github_oauth_app_client_id
GITHUB_CLIENT_SECRET=your_github_oauth_app_secret
SECRET_KEY=your_jwt_signing_key
```

### JWT Token Structure

The JWT payload contains minimal information; user details are fetched from the database. In this example, `exp` is 30 minutes after `iat`:

```json
{
  "sub": "scientist_alice",
  "exp": 1735605000,
  "iat": 1735603200
}
```

The `sub` field contains the username, which is used to look up the full user object (including roles and permissions) from the database during token verification.
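To make the three-claim payload concrete, the following is a minimal standard-library sketch of how an HS256 token carries those claims. The helper names (`encode_hs256`, `peek_payload`, `signature_is_valid`) and the `demo-secret` value are hypothetical and exist only for illustration; the API itself relies on a JWT library rather than hand-rolled encoding.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def encode_hs256(payload: dict, secret: str) -> str:
    """Build header.payload.signature for an HS256 token (illustrative only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def peek_payload(token: str) -> dict:
    """Read the claims WITHOUT verifying the signature (inspection only)."""
    _header, payload_segment, _signature = token.split(".")
    return json.loads(b64url_decode(payload_segment))


def signature_is_valid(token: str, secret: str) -> bool:
    """Recompute the HMAC over header.payload and compare in constant time."""
    signing_input, _, signature_segment = token.rpartition(".")
    expected = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return hmac.compare_digest(expected, b64url_decode(signature_segment))


issued_at = 1735603200
payload = {"sub": "scientist_alice", "exp": issued_at + 30 * 60, "iat": issued_at}
token = encode_hs256(payload, "demo-secret")
claims = peek_payload(token)
```

Because the payload is only encoded, not encrypted, anyone holding the token can read `sub`, `exp`, and `iat`. This is one reason to keep roles and permissions in the database rather than in the token.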
### JWT Generation

```python
from datetime import datetime, timedelta, timezone

from jose import jwt


def create_jwt_token(user: User) -> str:
    # `User` and `SECRET_KEY` come from the application's models and settings.
    now = datetime.now(timezone.utc)  # timezone-aware replacement for utcnow()
    payload = {
        "sub": user.username,  # Username in subject
        "exp": now + timedelta(minutes=30),  # 30 minute expiration
        "iat": now,
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")
```

### JWT Verification

```python
from typing import Optional

from jose import JWTError, jwt
from sqlalchemy.orm import Session  # assumes `db` is a SQLAlchemy session


def verify_jwt_token(token: str, db: Session) -> Optional[User]:
    try:
        # decode() checks both the signature and the `exp` claim
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    except JWTError:
        return None

    username = payload.get("sub")  # Username in subject
    if username is None:
        return None

    # Roles and permissions are loaded with the user, not read from the token
    return db.query(User).filter(User.username == username).first()
```

### Token Expiration

- **Default expiration**: 30 minutes
- **Refresh mechanism**: Re-login through GitHub OAuth
- **No refresh tokens**: Simplified security model
- **CSRF protection**: State verification enabled in OAuth callback

## API Tokens

### Token Generation

TBD: This section has to be updated once the authentication system is completely implemented.

API tokens are generated with:

```{literalinclude} ../../ccat_ops_db_api/auth/unified_auth.py
:language: python
:pyobject: generate_api_token
```

**Important**: The raw token is shown once; only the hash is stored.
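The "shown once, stored as a hash" rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `generate_api_token`: the function names `sketch_generate_token` and `matches_stored_hash` are hypothetical, and the token length is an arbitrary choice. Only the `ops_api_token_` prefix, the `secrets` module, and SHA-256 hashing are taken from this page.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

TOKEN_PREFIX = "ops_api_token_"  # prefix documented on this page


def sketch_generate_token() -> Tuple[str, str]:
    """Return (raw_token, token_hash); only the hash should be persisted."""
    # 32 random bytes is an assumed length, not a documented one
    raw_token = TOKEN_PREFIX + secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(raw_token.encode("utf-8")).hexdigest()
    return raw_token, token_hash


def matches_stored_hash(presented_token: str, stored_hash: str) -> bool:
    """Hash the presented token and compare against the stored hash."""
    presented_hash = hashlib.sha256(presented_token.encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(presented_hash, stored_hash)


raw_token, stored_hash = sketch_generate_token()
```

Since the database holds only `stored_hash`, a leaked table does not reveal usable tokens, and a lost raw token cannot be recovered; it has to be regenerated.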
### Token Storage

Database schema:

```{literalinclude} ../../../ops-db/ccat_ops_db/models.py
:language: python
:pyobject: ApiToken
```

### Token Verification

```{literalinclude} ../../ccat_ops_db_api/auth/unified_auth.py
:language: python
:pyobject: verify_api_token
```

### Usage Tracking

API tokens track:

- **Last used**: Timestamp of most recent use
- **Usage count**: Total number of requests
- **IP address**: (Optional) Last request IP
- **User agent**: (Optional) Last request client

This helps identify:

- Unused tokens (can be revoked)
- Suspicious activity
- Service health monitoring

### Development Tokens

For local development, the system automatically creates deterministic development tokens that can be reused across database resets. These tokens are **only valid in development environments** and are automatically rejected in production.

### Automatic Seeding

Development tokens are automatically created at one of two points:

1. **Database initialization**: When running `opsdb_init` with `data_archive_mode="development"`
2. **API startup** (fallback): When the API starts with `ENVIRONMENT=development` (if tokens weren't seeded during init)

### Token Format

Development tokens are clearly identifiable by their prefix:

```text
ops_api_token_dev_
```

This prefix makes them easy to identify and block in production environments.

### Deterministic Generation

Development tokens are generated deterministically using HMAC-SHA256:

```python
# Pseudocode: hmac_sha256 and base64_encode stand in for the real helpers
token = hmac_sha256(service_name + DEV_TOKEN_SECRET)
full_token = f"ops_api_token_dev_{base64_encode(token)}"
```

This means:

- Same `DEV_TOKEN_SECRET` + same service name = same token
- Tokens are reusable across database resets
- Tokens can be documented and shared within the development team

### Default Development Tokens

Two development tokens are created by default:

1. **service_dev-pipeline**: Service account with scopes:
   - `read:observations`
   - `write:observations`
   - `read:data`
   - `write:data`
2. **service_dev-cli**: Full access for CLI tools:
   - `read:*`
   - `write:*`

### Environment Configuration

Set the `DEV_TOKEN_SECRET` environment variable to customize token generation:

```bash
export DEV_TOKEN_SECRET="your-dev-secret-key"
```

If not set, a default secret is used (with a warning).

### Production Safety

Development tokens are **automatically rejected** in non-development environments:

- Tokens starting with `ops_api_token_dev_` are checked
- The environment must be explicitly set to `development`, `dev`, or `local`
- Attempts to use dev tokens in production are logged as security warnings
- The API returns `401 Unauthorized` when a dev token is used in production

### Usage Example

After database initialization, tokens are printed to the console:

```text
================================================================================
DEVELOPMENT TOKENS CREATED
================================================================================
Save these tokens in your development environment:

# Development API Tokens
export DEV_PIPELINE_TOKEN="ops_api_token_dev_..."
export DEV_CLI_TOKEN="ops_api_token_dev_..."

⚠️ These tokens are ONLY valid in development mode!
================================================================================
```

Use in development scripts:

```python
import os

import requests

# Fail fast with a KeyError if the token was never exported
token = os.environ["DEV_PIPELINE_TOKEN"]
headers = {"Authorization": f"Bearer {token}"}

response = requests.get(
    "http://localhost:8000/api/observations",
    headers=headers,
)
response.raise_for_status()  # surface 401/403 instead of ignoring them
```

### Token Management

The API provides token management endpoints under `/api/tokens/`:

**Create token** (token shown only once):

```bash
curl -X POST http://localhost:8000/api/tokens/ \
  -H "Authorization: Bearer YOUR_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Observatory Automation",
    "scopes": ["read:observations", "write:data"],
    "expires_in_days": 365
  }'
```

The response includes the full token (shown only once):

```json
{
  "token": "ops_api_token_abc123xyz789...",
  "token_info": {
    "id": 42,
    "name": "Observatory Automation",
    "token_prefix": "abc12345",
    "scopes": ["read:observations", "write:data"],
    "expires_at": "2026-01-01T00:00:00Z",
    "active": true,
    "usage_count": 0
  }
}
```

**Available endpoints**:

- `GET /api/tokens/scopes` - Get available scopes
- `POST /api/tokens/` - Create token
- `GET /api/tokens/` - List all tokens
- `GET /api/tokens/{id}` - Get token details
- `PUT /api/tokens/{id}` - Update token
- `GET /api/tokens/{id}/usage` - Get usage statistics
- `POST /api/tokens/{id}/regenerate` - Regenerate token
- `DELETE /api/tokens/{id}` - Revoke token
- `DELETE /api/tokens/{id}/permanent` - Permanently delete
- `POST /api/tokens/bulk-revoke` - Bulk revoke
- `GET /api/tokens/export` - Export token list

See {doc}`../../AuthToken` for complete endpoint documentation.

## Role-Based Access Control (RBAC)

### Default Roles

```{eval-rst}
.. list-table::
   :header-rows: 1
   :widths: 15 50 35

   * - Role
     - Permissions
     - Typical Users
   * - **admin**
     - Full access, user management, system configuration
     - System administrators
   * - **observer**
     - Create/update observations, register data files
     - Observatory operators, automation
   * - **viewer**
     - Read-only access to all data
     - Scientists, collaborators
   * - **service**
     - Automated operations, no UI access
     - Background services, scripts
```

### Permission Model

Permissions are hierarchical:

```text
read:observations
write:observations
delete:observations
manage:users
configure:system
```

### Decorators for Authorization

**Require specific roles**:

```python
from ccat_ops_db_api.auth import require_roles

@router.post("/admin/users")
@require_roles("admin")
async def create_user(
    user_data: UserCreate,
    current_user: User = Depends(get_current_user)
):
    # Only admins can create users
    ...
```

**Require specific permissions** (enforces token scopes for API tokens):

```python
from ccat_ops_db_api.auth import require_permissions

@router.get("/observations")
@require_permissions("read:observations")
async def get_observations(
    current_user: User = Depends(get_current_user)
):
    # For API tokens: checks token scopes
    # For JWT tokens: checks role permissions
    ...
```

**Service account only** (rejects JWT tokens):

```python
from ccat_ops_db_api.auth import get_service_user, require_service_token

@router.post("/executed_obs_units/start")
@require_service_token
async def start_observation(
    obs_data: ExecutedObsUnitCreate,
    current_user: User = Depends(get_service_user)
):
    # Only accepts API tokens from service accounts
    # JWT tokens will raise AuthenticationError
    ...
```

**Multiple roles or permissions**:

```python
@require_roles("admin", "observer")  # OR logic: any one role suffices
async def protected_endpoint(...):
    ...

@require_permissions("read:observations", "read:sources")  # AND logic: all are required
async def complex_query(...):
    ...
```

### Helper Functions

```python
from ccat_ops_db_api.auth import has_role, has_permission

# Check role
if has_role(current_user, "admin"):
    # Show admin options
    pass

# Check permission
if has_permission(current_user, "delete:observations"):
    # Allow deletion
    pass
```

### Database Schema

```sql
CREATE TABLE role (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50) UNIQUE,
    description TEXT
);

CREATE TABLE permission (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) UNIQUE,
    description TEXT
);

CREATE TABLE user_role (
    user_id INTEGER REFERENCES "user"(id),
    role_id INTEGER REFERENCES role(id),
    PRIMARY KEY (user_id, role_id)
);

CREATE TABLE role_permission (
    role_id INTEGER REFERENCES role(id),
    permission_id INTEGER REFERENCES permission(id),
    PRIMARY KEY (role_id, permission_id)
);
```

## Authentication vs Authorization

**Authentication**: Who are you?

- JWT or API token proves identity
- Returns a `User` object
- `401 Unauthorized` on failure

**Authorization**: What can you do?

- Roles and permissions determine access
- Checked after authentication
- `403 Forbidden` if permissions are insufficient

## Error Responses

### 401 Unauthorized

Missing or invalid token:

```json
{
  "detail": "Could not validate credentials"
}
```

### 403 Forbidden

Valid token but insufficient permissions or scopes:

```json
{
  "detail": "Insufficient permissions. Required roles: admin"
}
```

Or for API tokens with missing scopes:

```json
{
  "detail": "Token missing required scopes: write:data. Token has scopes: read:observations"
}
```

## Token Usage Examples

### Using JWT (UI User)

```python
import requests

# After GitHub OAuth login, frontend receives JWT
jwt_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
headers = {"Authorization": f"Bearer {jwt_token}"}

# Make authenticated request
response = requests.get(
    "http://api.example.com/api/transfer/overview",
    headers=headers
)
```

### Using API Token (Service Script)

```python
import requests

# API token for observatory automation
api_token = "ops_api_token_abc123xyz789..."
headers = {"Authorization": f"Bearer {api_token}"}

# Record observation
response = requests.post(
    "http://api.example.com/executed_obs_units/start",
    headers=headers,
    json={
        "obs_unit_id": 123,
        "start_time": "2025-01-01T00:00:00Z",
        # ...
    }
)
```

## Security Best Practices

### For JWT Tokens

- Use HTTPS in production
- Short expiration (30 minutes)
- Secure SECRET_KEY (32+ random bytes)
- Don't store in localStorage (XSS risk) - use httpOnly cookies
- CSRF protection enabled via state token verification

### For API Tokens

- Generate with cryptographic randomness (`secrets` module)
- Store only hashed versions (SHA-256)
- Require HTTPS for transmission
- Set expiration dates
- Monitor usage and revoke unused tokens
- Rotate tokens periodically

## Summary

The authentication system provides:

- **Unified interface**: Same header format for both token types
- **Dual authentication**: GitHub OAuth (users) + API tokens (services)
- **RBAC**: Role and permission-based authorization
- **Scope enforcement**: Fine-grained permissions for API tokens
- **Service account isolation**: Service-only endpoints reject JWT tokens
- **Usage tracking**: Monitor API token usage (count, IP, timestamps)
- **Security**: Hashed storage, expiration, HTTPS enforcement, CSRF protection

Token comparison:

```{eval-rst}
.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Feature
     - GitHub OAuth + JWT
     - API Tokens
   * - **Use case**
     - Interactive web users
     - Automation and services
   * - **Lifetime**
     - 30 minutes (re-login)
     - Configurable (1-365 days personal, up to 3 years service)
   * - **Scopes**
     - Role-based permissions
     - Fine-grained scopes (enforced)
   * - **Revocation**
     - Re-login required
     - Instant via API/database
   * - **Usage tracking**
     - No
     - Yes (last used, count)
   * - **Storage**
     - Frontend (memory/cookies)
     - Scripts (env vars/config)
```

## Next Steps

- {doc}`../deep-dive/authentication/unified-auth` - Implementation details
- {doc}`../deep-dive/authentication/github-oauth` - OAuth flow deep dive
- {doc}`../deep-dive/authentication/api-tokens` - Token management details
- {doc}`../../AuthToken` - Complete token management API reference
- {doc}`../tutorials/simple-endpoints/adding-authentication` - Tutorial for securing endpoints