Database Issues
SQLite Post-Migration Rust Backfill for Application-Generated Values
SQLite Post-Migration Rust Backfill for Application-Generated Values
Problem
Schema V002 adds ref_code TEXT NOT NULL DEFAULT '' columns to four tables. The ref_code values must be 11-char base62
nanoids generated by Rust’s nanoid crate. Pure SQL cannot generate these — hex(randomblob(N)) produces hex strings
(wrong alphabet, wrong length), and SQLite has no built-in nanoid function.
Symptoms:
hex(randomblob(8))placeholder UPDATE in migration produces 16-char hex strings, not 11-char base62- Unique constraint on
ref_codewould reject duplicate placeholders if multiple rows exist - Format mismatch between migration-generated and application-generated ref_codes
Root Cause
Application-layer identifier formats (nanoid with custom alphabet) cannot be replicated in pure SQL. The migration schema DDL and the data backfill have different capability requirements: DDL is SQL-native, but value generation needs the application runtime.
Solution
Split the migration into two phases:
Phase 1: Schema DDL (pure SQL, in migration file)
-- Add column with empty default (no constraint violation on existing rows)ALTER TABLE pages ADD COLUMN ref_code TEXT NOT NULL DEFAULT '';-- Partial unique index ignores empty strings during backfill windowCREATE UNIQUE INDEX idx_pages_ref_code ON pages(ref_code) WHERE ref_code != '';Phase 2: Rust post-migration hook (after migrations.to_latest())
fn run_migrations(conn: &Connection) -> MigrationResult<()> { let migrations = Migrations::new(vec![/* ... */]); migrations.to_latest(conn)?;
// Post-migration backfill: generate proper ref_codes for empty rows backfill_ref_codes(conn)?; Ok(())}
fn backfill_ref_codes(conn: &Connection) -> MigrationResult<()> { for table in &["pages", "blocks", "bookmarks", "attachments"] { backfill_ref_codes_for_table(conn, table)?; } Ok(())}
fn backfill_ref_codes_for_table(conn: &Connection, table: &str) -> MigrationResult<()> { let sql = format!("SELECT id FROM {} WHERE ref_code = ''", table); let mut stmt = conn.prepare(&sql)?; let ids: Vec<String> = stmt .query_map([], |row| row.get::<_, String>(0))? .collect::<Result<Vec<_>, _>>()?;
if ids.is_empty() { return Ok(()); }
let update_sql = format!("UPDATE {} SET ref_code = ?1 WHERE id = ?2", table); for id in &ids { let ref_code = RefCode::generate().to_string(); conn.execute(&update_sql, params![ref_code, id])?; } Ok(())}Implementation Notes
- The backfill is idempotent: only updates rows where
ref_code = '' - Runs on every app startup (inside
run_migrations), but no-ops after first backfill - The partial unique index (
WHERE ref_code != '') prevents the empty default from conflicting across rows while still enforcing uniqueness for real values - Each ref_code is generated individually to ensure uniqueness (collision probability negligible at 65.5 bits entropy per value)
Prevention
Best Practices
- If SQL can’t generate the value, don’t fake it in SQL. Use a post-migration hook in the application language.
- Use
DEFAULT ''+ partial unique index as a two-phase pattern: DDL creates the column safely, application backfill populates it correctly. - Keep the backfill idempotent with a
WHERE column = ''orWHERE column IS NULLguard so it’s safe to run on every startup.
When to Use This Pattern
- Custom identifier formats (nanoid, ULID, KSUID with specific alphabets)
- Values derived from application configuration (not available in SQL context)
- CRDT snapshots or binary blobs that need application-layer serialization
- Any value where the generation algorithm lives in application code
When NOT to Use This Pattern
- Values SQL can generate natively (UUIDs via
lower(hex(randomblob(16))), timestamps, sequential IDs) - Simple defaults (booleans, empty strings, zero values)
References
crates/infrastructure/sqlite/src/migrations/mod.rs— backfill implementationcrates/domain/src/identifiers.rs— RefCode::generate()docs/solutions/database-issues/sql-migration-column-evolution-audit.md— related column audit pattern
Previous
SQLite Performance Query Silently Zeroes Fields Next
SQLite Queue Full-Table Scan Anti-Pattern
SQLite Performance Query Silently Zeroes Fields Next
SQLite Queue Full-Table Scan Anti-Pattern
Was this page helpful?
Thanks for your feedback!