Database Migrations#

This page explains how the ops-db schema evolves over time using Alembic, and how developers work with that tooling day-to-day. For the deployment side — how migrations run in staging and production — see Deploying Database Migrations.

Why Migrations Exist#

The ops-db schema is defined entirely in Python, through SQLAlchemy ORM model classes in ccat_ops_db/models.py. When a developer adds a column, creates a new table, or changes an enum, the live PostgreSQL database does not automatically follow — those changes exist only in Python code until something applies them to the database.

Alembic is that something. It maintains a versioned history of every schema change as numbered migration scripts (stored in alembic/versions/). Each script has an upgrade() function that applies the change and a downgrade() function that reverses it. Alembic records which revision the database is currently at inside the alembic_version table, so it always knows which scripts still need to run.

The causality chain is:

Developer edits models.py
       ↓
alembic revision --autogenerate   ← generates a migration script
       ↓
Developer reviews and commits the script with the model change (same PR)
       ↓
CI runs the full migration round-trip against a real PostgreSQL container
       ↓
Staging deploy: opsdb_migrate applies pending scripts before services restart
       ↓
Production deploy: same — opsdb_migrate runs before ccat update

Nothing lands in a live database that has not passed CI.

The Two Lifecycle Paths#

There are exactly two situations the database can be in, and each has a different tool:

| Situation | Tool | What happens |
| --- | --- | --- |
| Fresh database (first-ever install on a host) | opsdb_init | Runs Base.metadata.create_all() to build all tables in one shot, seeds reference data from the data-center repository, then stamps the Alembic version table at head so future calls to opsdb_migrate know no scripts need to run. |
| Existing database (every subsequent deploy) | opsdb_migrate | Runs alembic upgrade head, which applies only the migration scripts that have not yet been applied to that database. A no-op if the database is already at head. |
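The distinction boils down to whether the database has ever been initialised, which is visible from the presence of the alembic_version table. A minimal sketch of that check (the function name and the use of a URL argument are illustrative, not part of the real tooling):

```python
from sqlalchemy import create_engine, inspect

def needs_bootstrap(url: str) -> bool:
    """True when the database has never been initialised (no
    alembic_version table), i.e. opsdb_init is the right tool.
    Sketch only; the real tools make this decision internally."""
    engine = create_engine(url)
    return "alembic_version" not in inspect(engine).get_table_names()
```

An in-memory SQLite URL such as "sqlite://" would report True, since a fresh database has no alembic_version table.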

Important

Never run opsdb_init --drop against a database that holds real data. opsdb_init is for bootstrapping a brand-new instance. The production safety guard in opsdb_init will refuse --drop when ENV_FOR_DYNACONF starts with "production", but the broader rule applies to any environment with live data.
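The shape of that guard can be sketched as follows (illustrative only; the real check lives inside opsdb_init and its exact wording and placement may differ):

```python
def refuse_drop(env: str) -> None:
    """Abort when --drop is requested in a production-like environment.

    Sketch of the safety guard described above: the caller passes the
    current ENV_FOR_DYNACONF value, e.g.
    refuse_drop(os.environ.get("ENV_FOR_DYNACONF", "")).
    """
    if env.startswith("production"):
        raise SystemExit(f"refusing --drop: ENV_FOR_DYNACONF={env!r}")
```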

The stamp that opsdb_init writes is the bridge between these two paths. Because the schema was created all at once by create_all(), there is nothing for Alembic to migrate — but Alembic needs to know that. The stamp records the current HEAD revision in alembic_version without running any migration scripts, so the next call to opsdb_migrate finds nothing pending and exits cleanly.
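The create_all-then-stamp sequence can be illustrated with a toy model against in-memory SQLite (Widget stands in for the real ccat_ops_db models, and "abc123" is a made-up revision id; opsdb_init uses alembic stamp head rather than writing the row by hand):

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Widget(Base):  # stand-in for the real ops-db model classes
    __tablename__ = "widget"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://")   # SQLite replaces PostgreSQL for the demo
Base.metadata.create_all(engine)      # the create_all() step opsdb_init runs

# "alembic stamp head" then records the head revision in alembic_version
# WITHOUT executing any migration scripts -- shown here by writing the
# row directly:
meta = MetaData()
version = Table(
    "alembic_version", meta,
    Column("version_num", String(32), primary_key=True),
)
meta.create_all(engine)
with engine.begin() as conn:
    conn.execute(version.insert().values(version_num="abc123"))
```

After this, an upgrade to head finds the recorded revision already at head and has nothing to do, which is exactly the behaviour opsdb_migrate relies on.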

Developer Workflow#

Follow these steps whenever you change the schema (add/remove/rename columns or tables, change column types, add indexes or constraints, add or change enum values).

Step 1 — Edit the model#

Make your change in ccat_ops_db/models.py. Run the existing test suite to check nothing is broken at the ORM level:

pytest tests/

Step 2 — Generate the migration script#

Autogenerate compares the current SQLAlchemy metadata against the live database and emits the difference as a migration script. Use a real PostgreSQL database — autogenerate against SQLite misses PostgreSQL-specific type differences (UUID, JSON with astext_type, custom enum handling):

# Point Alembic at a clean PostgreSQL database
export ENV_FOR_DYNACONF=local          # or any postgresql profile
export CCAT_OPS_DB_DATABASE_POSTGRESQL_HOST=localhost
export CCAT_OPS_DB_DATABASE_POSTGRESQL_DATABASE=ccat_ops_db_dev

alembic revision --autogenerate -m "add column foo to bar"

The generated file lands in alembic/versions/.

Step 3 — Review the generated script#

Autogenerate is helpful but not perfect. Always read the script before committing. Check for:

  • Missing imports — autogenerate sometimes emits bare Text() inside postgresql.JSON(astext_type=Text()) without importing Text. Add from sqlalchemy import Text (or whatever is missing) at the top of the script.

  • PostgreSQL-specific types — columns using postgresql.UUID, postgresql.JSON, or similar should render with the full dialect type. If they appear as generic sa.String() or sa.VARCHAR(), adjust them manually.

  • Enum types in downgrade — PostgreSQL creates a server-side type for every Python enum.Enum mapped column. Alembic drops the column in downgrade but does not always drop the type. If your migration adds a new enum, add an explicit sa.Enum(name='myenum').drop(op.get_bind(), checkfirst=True) call at the end of downgrade().

  • Index and constraint names — check that autogenerated names are deterministic and match any existing names in the database.

  • Nullable / server_default — verify that new columns are nullable or have a server default if the table already contains rows, otherwise the migration will fail on a non-empty database.

Step 4 — Test the migration locally#

Run the full round-trip against your local PostgreSQL instance. This is the same sequence that CI runs:

alembic upgrade head     # apply all pending migrations
alembic downgrade -1     # roll back the newest one
alembic upgrade head     # re-apply — confirms both directions work

If either direction raises an error, fix the script before committing.

Check which revision the database is now at:

alembic current

Step 5 — Commit model change and migration together#

The model change and the migration script must land in the same commit (or at least the same PR). Separating them leaves a window in the history where the Python models and the migration scripts are out of sync, which causes confusion when reviewing or bisecting later.

git add ccat_ops_db/models.py alembic/versions/<new_revision>.py
git commit -m "feat: add column foo to bar table"

CI Validation#

The GitHub Actions workflow (ci_testing.yml) spins up a postgres:16 container as a service and runs the migration round-trip on every push:

alembic upgrade head    ← must apply cleanly to a fresh PostgreSQL DB
alembic downgrade -1    ← must reverse cleanly
alembic upgrade head    ← must re-apply cleanly

A PR cannot be merged if this step fails. This catches:

  • Import errors in the migration script

  • Type mismatches that only appear on PostgreSQL (not SQLite)

  • Downgrade functions that fail due to missing enum type cleanup

  • Constraint violations on upgrade (e.g. NOT NULL column added to a non-empty table without a default)

The CI database is ephemeral and always starts empty, so it tests the migration in isolation from any pre-existing state.
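For orientation, the relevant workflow piece looks roughly like the following. This is an illustrative sketch, not a copy of ci_testing.yml; job names, credentials, and install steps are assumptions, while the postgres:16 service image and the three-command round-trip come from the description above:

```yaml
jobs:
  migrations:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: ccat_ops_db_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd "pg_isready"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - run: pip install -e .
      - run: |
          alembic upgrade head
          alembic downgrade -1
          alembic upgrade head
```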

Inspecting Migration History#

Useful Alembic commands for understanding the current state:

# What revision is the database currently at?
alembic current

# Show all migration scripts in order
alembic history --verbose

# Show only the scripts not yet applied
alembic history -r current:head

These commands use the same dynaconf settings as everything else, so set ENV_FOR_DYNACONF appropriately before running them.