Skip to content
Documentation GitHub
Database Issues

SQLite Performance Query Silently Zeroes Fields

SQLite Performance Query Silently Zeroes Fields

Problem

An integration test (test_merge_tags) failed with usage_count == 0 when 2 was expected, despite the tag being correctly assigned to 2 pages.

Symptoms:

  • tag_repo.get_tag_by_id(id).usage_count always returns 0
  • tag_repo.list_tags() returns the correct usage_count for the same tag
  • No error, no warning — the field is silently zero

Investigation

Steps Tried

  1. Checked SetPageTagsUseCase — correctly inserts into page_tags table
  2. Checked page_tags table contents — both assignments present
  3. Compared SQL queries between get_tag_by_id and list_tags — found the difference

Root Cause

Two SQL query constants serve different purposes:

-- TAG_SELECT_SIMPLE (used by get_tag_by_id, get_tag_by_name)
-- Returns 0 AS usage_count to avoid LEFT JOIN overhead
SELECT t.id, t.name, t.slug, t.color, t.group_id,
0 AS usage_count, -- ← hardcoded zero!
t.created_at, t.updated_at
FROM tags t
-- TAG_SELECT (used by list_tags, search_tags)
-- Computes real count via LEFT JOIN
SELECT t.id, t.name, t.slug, t.color, t.group_id,
COUNT(pt.tag_id) AS usage_count,
t.created_at, t.updated_at
FROM tags t
LEFT JOIN page_tags pt ON pt.tag_id = t.id
GROUP BY t.id

The performance optimization is documented in the code comment:

“Used by single-entity lookups that only need tag metadata. Avoids the LEFT JOIN overhead. Returns usage_count as 0 — callers that need real counts should use TAG_SELECT.”

The test used the wrong accessor method.

Solution

// Before (broken) — get_tag_by_id uses TAG_SELECT_SIMPLE, always returns 0
let source_before = tag_repo.get_tag_by_id(source.id).unwrap();
assert_eq!(source_before.usage_count, 2); // FAILS: left=0
// After (fixed) — list_tags uses TAG_SELECT with real LEFT JOIN count
let all_tags = tag_repo.list_tags().unwrap();
let source_before = all_tags.iter().find(|t| t.id == source.id).unwrap();
assert_eq!(source_before.usage_count, 2); // PASSES

Prevention

Best Practices

  • Never assert on fields from “simple” queries — if a query skips expensive joins for performance, its denormalized fields cannot be trusted for correctness assertions
  • Document zero-valued fields at the type level — when a struct field may be intentionally zero depending on how it was loaded, consider using Option<u32> instead of u32 to make the ambiguity visible
  • Name query variants explicitlyTAG_SELECT_SIMPLE vs TAG_SELECT is good, but methods like get_tag_by_id_with_counts() would make the caller’s choice explicit

Warning Signs

  • A struct field that is loaded from different SQL queries with different semantics
  • Performance-optimized queries that hardcode 0 AS field_name or NULL AS field_name
  • Tests that pass in unit tests (with mocks that return real values) but fail in integration tests (with real SQL)

Pattern: Two-Query Accessor

When a repository has both “fast” and “full” queries, consider:

  1. Separate return types: TagSummary (no count) vs TagWithStats (with count)
  2. Or Option<u32> for fields that may not be populated
  3. At minimum, document which methods return which fields

References

  • Commit: ee03e57e — fix(tests): use list_tags for real usage_count in test_merge_tags
  • File: crates/infrastructure/sqlite/src/workspace/tag_repository.rs (TAG_SELECT_SIMPLE vs TAG_SELECT)
  • File: tests/core/tests/tag_tests.rs:289

Was this page helpful?