Migration Patterns Reference

Expand-Contract Pattern

The fundamental pattern for zero-downtime schema changes. Every breaking change goes through three phases: expand (add new), migrate (backfill), contract (remove old).

Phase 1 — Expand

Add the new structure. The old structure still exists and the application still writes to it. Both old and new must be valid at this point.

-- Example: rename column users.full_name to users.display_name

-- Phase 1: add new column, make it nullable (no default required)
ALTER TABLE users ADD COLUMN display_name TEXT;

Phase 2 — Dual-Write (Application Deploy)

Deploy application code that writes to both old and new column. Reads still use the old column.
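Application-level dual writes are fragile if any write path is missed. As a safety net, a database trigger can mirror the old column into the new one on every write during this window. A sketch, using the example columns above (the function and trigger names are illustrative):

```sql
-- Illustrative safety net: copy full_name into display_name on every write,
-- so rows touched by code paths that missed the dual-write stay consistent.
CREATE OR REPLACE FUNCTION users_sync_display_name() RETURNS trigger AS $$
BEGIN
  IF NEW.display_name IS NULL THEN
    NEW.display_name := NEW.full_name;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_sync_display_name
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION users_sync_display_name();
-- (EXECUTE FUNCTION is Postgres 11+; older versions use EXECUTE PROCEDURE)
```

Drop the trigger in the contract phase, together with the old column.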

-- Backfill existing rows: chunk it to avoid long locks
DO $$
DECLARE
  batch_size INT := 10000;
  last_id BIGINT := 0;
  max_id BIGINT;
BEGIN
  SELECT COALESCE(MAX(id), 0) INTO max_id FROM users;
  WHILE last_id < max_id LOOP
    UPDATE users
    SET display_name = full_name
    WHERE id > last_id AND id <= last_id + batch_size
      AND display_name IS NULL;
    last_id := last_id + batch_size;
    COMMIT; -- release locks between batches (Postgres 11+; run the DO block
            -- outside an explicit transaction, or chunking gains nothing)
    PERFORM pg_sleep(0.01); -- brief pause to reduce load
  END LOOP;
END $$;

Phase 2b — Cutover (Application Deploy)

Deploy application code that reads from the new column. Still writes to both.
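If cutover happens before the backfill is verified complete, reads can tolerate not-yet-backfilled rows by falling back to the old column. A sketch using the example columns above:

```sql
-- Read the new column, falling back to the old one for unbackfilled rows
SELECT id, COALESCE(display_name, full_name) AS display_name
FROM users
WHERE id = 42;
```

Once the backfill is confirmed complete, drop the COALESCE fallback so the contract phase can proceed.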

Phase 3 — Contract (Application Deploy + Migration)

Deploy application code that reads and writes only to the new column. Then drop the old.

-- Phase 3: safe to drop old column now
ALTER TABLE users DROP COLUMN full_name;
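Before running the contract migration, it is worth verifying that the backfill actually completed. A quick check, assuming the example columns above:

```sql
-- Should return 0 before dropping full_name
SELECT COUNT(*) AS unmigrated
FROM users
WHERE display_name IS NULL AND full_name IS NOT NULL;
```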

Backward-Compatible Changes (Safe, Single Migration)

These changes do not require expand-contract:

| Change                           | Safe?              | Notes                                        |
|----------------------------------|--------------------|----------------------------------------------|
| Add nullable column              | Yes                | No default = no table rewrite                |
| Add column with constant default | Yes (Postgres 11+) | Default stored in catalog, no rewrite        |
| Add column with dynamic default  | No                 | Causes table rewrite                         |
| Add index (CONCURRENTLY)         | Yes                | Non-blocking                                 |
| Add constraint (NOT VALID)       | Yes                | Validates existing rows later                |
| Increase varchar length          | Yes                | No rewrite                                   |
| Drop column (mark unused first)  | Careful            | App must stop referencing it                 |
| Rename column                    | No                 | Breaking — use expand-contract               |
| Change column type               | No                 | Breaking — use expand-contract               |
| Drop NOT NULL                    | Yes                | Metadata-only change                         |
| Add NOT NULL                     | No                 | Requires backfill + constraint               |
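The NOT VALID row above applies to foreign keys as well as CHECK constraints. A sketch, assuming an orders.user_id column referencing users:

```sql
-- Add the FK without scanning existing rows (only a brief metadata lock)
ALTER TABLE orders
  ADD CONSTRAINT orders_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

-- Validate later; takes SHARE UPDATE EXCLUSIVE and does not block reads or writes
ALTER TABLE orders VALIDATE CONSTRAINT orders_user_id_fkey;
```

New writes are checked against the constraint immediately; only the validation of pre-existing rows is deferred.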

Adding NOT NULL Without Downtime

Adding NOT NULL to an existing column requires all existing rows to have a non-null value. The naive ALTER TABLE ... SET NOT NULL locks the table while it validates every row.

-- Step 1: Add column as nullable, backfill, then add constraint as NOT VALID
ALTER TABLE orders ADD COLUMN confirmed_at TIMESTAMPTZ;

-- Step 2: Backfill existing rows (for large tables, use the chunked
-- DO-block pattern from the expand-contract section instead of one UPDATE)
UPDATE orders SET confirmed_at = created_at WHERE confirmed_at IS NULL;

-- Step 3: Add NOT NULL constraint as NOT VALID (skips existing row validation)
ALTER TABLE orders ADD CONSTRAINT orders_confirmed_at_not_null
CHECK (confirmed_at IS NOT NULL) NOT VALID;

-- Step 4: Validate constraint in background (ShareUpdateExclusiveLock, non-blocking)
ALTER TABLE orders VALIDATE CONSTRAINT orders_confirmed_at_not_null;

-- Step 5: (Optional) Convert to a true NOT NULL. On Postgres 12+ this skips
-- the full-table scan because the validated CHECK constraint already proves
-- no nulls exist; the constraint can then be dropped.
ALTER TABLE orders ALTER COLUMN confirmed_at SET NOT NULL;
ALTER TABLE orders DROP CONSTRAINT orders_confirmed_at_not_null;

Lock Timeout Settings

Always set a lock timeout before DDL statements in production migrations. Without it, a migration waiting for a lock can block all reads and writes on the table indefinitely.

-- Set per-session lock timeout before each DDL statement
SET lock_timeout = '5s';
SET statement_timeout = '30s';

-- The migration will fail fast if it can't acquire a lock in 5 seconds
-- Retry later when the blocking query completes
ALTER TABLE orders ADD COLUMN status TEXT;

Retry strategy: If a migration fails due to lock timeout, find the blocking query in pg_stat_activity, wait for it to complete (or cancel it), then retry. Resist the urge to simply raise the timeout: while the DDL statement waits for its lock, it queues behind the blocker and in turn blocks every query that arrives after it.
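To see what is holding things up, pg_stat_activity combined with pg_blocking_pids() shows the blocking chain directly:

```sql
-- List sessions that are waiting on a lock and the PIDs blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```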


Index Migration Safety

-- Always use CONCURRENTLY for new indexes on live tables
-- CONCURRENTLY takes 2-3x longer but does not block reads or writes
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);

-- If CONCURRENTLY fails partway through, the index is left as INVALID
-- Find invalid indexes (join on OIDs, not names, to avoid cross-schema clashes):
SELECT c.relname AS index_name, t.relname AS table_name
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
JOIN pg_class t ON t.oid = i.indrelid
WHERE NOT i.indisvalid;

-- Drop the invalid index and recreate
DROP INDEX CONCURRENTLY idx_orders_status;
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);

Migration File Organization

migrations/
├── 0001_create_users.sql
├── 0002_create_orders.sql
├── 0003_add_status_to_orders.sql # expand phase
├── 0004_backfill_order_status.sql # data migration
└── 0005_drop_old_status_column.sql # contract phase (deploy after app)

Convention: Separate structural changes (DDL) from data migrations (DML). Run structural migrations at deploy time. Run data migrations as background jobs or chunked scripts when the table is large.


Rollback Strategies

Not all migrations are trivially reversible. Plan rollback before executing forward.

| Migration Type               | Rollback Strategy                              |
|------------------------------|------------------------------------------------|
| Add column                   | ALTER TABLE ... DROP COLUMN                    |
| Add index                    | DROP INDEX CONCURRENTLY                        |
| Add constraint NOT VALID     | DROP CONSTRAINT                                |
| Rename column (expand phase) | Drop new column — old still exists             |
| Drop column                  | Restore from backup — plan before executing    |
| Data backfill                | Re-run inverse transformation on affected rows |

Rule: Never drop a column in the same deploy window as the application code change that stops using it. Wait one deploy cycle to confirm the old column is unused, then drop it in a follow-up migration. This gives you a clean rollback path if the deploy goes wrong.


Testing Migrations

# Test migration against a clone of production data
pg_dump $PROD_URL | psql $TEST_URL
psql $TEST_URL -f migrations/0042_new_change.sql

# Time the migration — if it takes > 1s, it may cause issues in production
time psql $TEST_URL -f migrations/0042_new_change.sql

# Check for lock contention by running under load
# Use pgbench to simulate concurrent queries while migration runs
pgbench -T 30 -c 50 $TEST_URL &
psql $TEST_URL -f migrations/0042_new_change.sql