# Database Migrations

This page explains how the ops-db schema evolves over time using [Alembic](https://alembic.sqlalchemy.org/), and how developers work with that tooling day-to-day. For the deployment side — how migrations run in staging and production — see {doc}`deployment_migrations`.

## Why Migrations Exist

The ops-db schema is defined entirely in Python, through SQLAlchemy ORM model classes in `ccat_ops_db/models.py`. When a developer adds a column, creates a new table, or changes an enum, the live PostgreSQL database does not automatically follow — those changes exist only in Python code until something applies them to the database.

Alembic is that something. It maintains a versioned history of every schema change as numbered migration scripts (stored in `alembic/versions/`). Each script has an `upgrade()` function that applies the change and a `downgrade()` function that reverses it. Alembic records which revision the database is currently at inside the `alembic_version` table, so it always knows which scripts still need to run.

The causality chain is:

```text
Developer edits models.py
        ↓
alembic revision --autogenerate   ← generates a migration script
        ↓
Developer reviews and commits the script with the model change (same PR)
        ↓
CI runs the full migration round-trip against a real PostgreSQL container
        ↓
Staging deploy: opsdb_migrate applies pending scripts before services restart
        ↓
Production deploy: same — opsdb_migrate runs before ccat update
```

Nothing lands in a live database that has not passed CI.

## The Two Lifecycle Paths

There are exactly two situations the database can be in, and each has a different tool:

```{eval-rst}
.. list-table::
   :widths: 25 25 50
   :header-rows: 1

   * - Situation
     - Tool
     - What happens
   * - **Fresh database** (first-ever install on a host)
     - ``opsdb_init``
     - Runs ``Base.metadata.create_all()`` to build all tables in one shot, seeds reference data from the data-center repository, then stamps the Alembic version table at ``head`` so future calls to ``opsdb_migrate`` know no scripts need to run.
   * - **Existing database** (every subsequent deploy)
     - ``opsdb_migrate``
     - Runs ``alembic upgrade head``, which applies only the migration scripts that have not yet been applied to that database. A no-op if the database is already at head.
```

:::{important}
**Never run** `opsdb_init --drop` against a database that holds real data. `opsdb_init` is for bootstrapping a brand-new instance. The production safety guard in `opsdb_init` will refuse `--drop` when `ENV_FOR_DYNACONF` starts with `"production"`, but the broader rule applies to any environment with live data.
:::

The stamp that `opsdb_init` writes is the bridge between these two paths. Because the schema was created all at once by `create_all()`, there is nothing for Alembic to migrate — but Alembic needs to know that. The stamp records the current HEAD revision in `alembic_version` without running any migration scripts, so the next call to `opsdb_migrate` finds nothing pending and exits cleanly.

## Developer Workflow

Follow these steps whenever you change the schema (add/remove/rename columns or tables, change column types, add indexes or constraints, add or change enum values).

### Step 1 — Edit the model

Make your change in `ccat_ops_db/models.py`. Run the existing test suite to check nothing is broken at the ORM level:

```bash
pytest tests/
```

### Step 2 — Generate the migration script

Autogenerate compares the current SQLAlchemy metadata against the live database and emits the difference as a migration script.
Use a **real PostgreSQL database** — autogenerate against SQLite misses PostgreSQL-specific type differences (`UUID`, `JSON` with `astext_type`, custom enum handling):

```bash
# Point Alembic at a clean PostgreSQL database
export ENV_FOR_DYNACONF=local   # or any postgresql profile
export CCAT_OPS_DB_DATABASE_POSTGRESQL_HOST=localhost
export CCAT_OPS_DB_DATABASE_POSTGRESQL_DATABASE=ccat_ops_db_dev

alembic revision --autogenerate -m "add column foo to bar"
```

The generated file lands in `alembic/versions/`.

### Step 3 — Review the generated script

Autogenerate is helpful but not perfect. Always read the script before committing. Check for:

- **Missing imports** — autogenerate sometimes emits a bare `Text()` inside `postgresql.JSON(astext_type=Text())` without importing `Text`. Add `from sqlalchemy import Text` (or whatever is missing) at the top of the script.
- **PostgreSQL-specific types** — columns using `postgresql.UUID`, `postgresql.JSON`, or similar should render with the full dialect type. If they appear as generic `sa.String()` or `sa.VARCHAR()`, adjust them manually.
- **Enum types in downgrade** — PostgreSQL creates a server-side type for every mapped Python `enum.Enum` column. Alembic drops the *column* in downgrade but does not always drop the *type*. If your migration adds a new enum, add an explicit `sa.Enum(name='myenum').drop(op.get_bind(), checkfirst=True)` call at the end of `downgrade()`.
- **Index and constraint names** — check that autogenerated names are deterministic and match any existing names in the database.
- **Nullable / server_default** — verify that new columns are nullable or have a server default if the table already contains rows; otherwise the migration will fail on a non-empty database.

### Step 4 — Test the migration locally

Run the full round-trip against your local PostgreSQL instance.
This is the same sequence that CI runs:

```bash
alembic upgrade head    # apply all pending migrations
alembic downgrade -1    # roll back the newest one
alembic upgrade head    # re-apply — confirms both directions work
```

If either direction raises an error, fix the script before committing. Check which revision the database is now at:

```bash
alembic current
```

### Step 5 — Commit model change and migration together

The model change and the migration script must land in the **same commit** (or at least the same PR). Separating them leaves the repository in a state where the Python code and the migration are out of sync, causing confusion when reviewing the history.

```bash
git add ccat_ops_db/models.py alembic/versions/.py
git commit -m "feat: add column foo to bar table"
```

## CI Validation

The GitHub Actions workflow (`ci_testing.yml`) spins up a `postgres:16` container as a service and runs the migration round-trip on every push:

```text
alembic upgrade head    ← must apply cleanly to a fresh PostgreSQL DB
alembic downgrade -1    ← must reverse cleanly
alembic upgrade head    ← must re-apply cleanly
```

A PR cannot be merged if this step fails. This catches:

- Import errors in the migration script
- Type mismatches that only appear on PostgreSQL (not SQLite)
- Downgrade functions that fail due to missing enum type cleanup
- Constraint violations on upgrade (e.g. a NOT NULL column added to a non-empty table without a default)

The CI database is ephemeral and always starts empty, so it tests the migration in isolation from any pre-existing state.

## Inspecting Migration History

Useful Alembic commands for understanding the current state:

```bash
# What revision is the database currently at?
alembic current

# Show all migration scripts in order
alembic history --verbose

# Show only the scripts not yet applied
alembic history -r current:head
```

These commands use the same dynaconf settings as everything else, so set `ENV_FOR_DYNACONF` appropriately before running them.
## Related Documentation

- {doc}`deployment_migrations` — how migrations run in staging and production deployments
- {doc}`../config/index` — configuring database connections and environment selection
- {doc}`../api_reference/utilities` — `opsdb_init` reference
- [Alembic documentation](https://alembic.sqlalchemy.org/en/latest/)