
Wave-Based Sentinel Test Remediation Patterns

Problem

Large QA Sentinel test suites accumulate fixme and skip entries as features are added faster than UI test coverage can keep up. Without a systematic approach to remediation, fixme lists grow to 30+ items across 11 spec files, making it unclear which tests reflect real product gaps vs. infrastructure limitations.

INK-358 starting state: 275 active tests, 33 fixme tests across 11 spec files.

The 33 fixme tests fell into several failure categories:

  • Tests needing wiki-link backlinks could not inject content, because Playwright keyboard events bypass the CRDT sync boundary
  • Multi-workspace switching scenarios failed because BridgeState held hard-wired workspace repos
  • Trash UI was missing non-cascade delete and restore-ancestors paths
  • Fixme comments said “TODO” without specifying what was needed to unblock them
  • Some tests were genuinely untestable at the sentinel layer and needed proper disposition

Solution

Five-Wave Execution Strategy

The remediation ran in 5 waves with parallel agent teams:

```text
Wave 1: Quick wins + bridge enhancement
├── Quick-wins agent: test-only fixes (no bridge changes)
└── Bridge agent: add save_block_content_markdown + update_references routes

Wave 2: Content-injection tests (depends on Wave 1 bridge routes)
├── Wiki-links agent: convert backlink/outgoing tests to API-driven approach
└── Page-attribute agent: convert export test 16 to bridge API approach

Wave 3: UI improvements (parallel with Wave 4)
└── Trash agent: add non-cascade delete + restore-ancestors prompt to UI

Wave 4: Re-wireable BridgeState for multi-workspace support
└── Bridge-state agent: WorkspaceContext struct + switch_workspace + accessor methods

Wave 5: Final disposition sweep
└── Fixme agent: update remaining fixme comments to BLOCKED/PERMANENT/COVERED format
```

Each wave ran agents in parallel with strict file ownership. The leader committed at wave boundaries.


Key Pattern 1: Bridge Route as Test Enabler

Problem

Tests for wiki-link backlinks needed to inject markdown with [[Display|slug]] syntax into page content. Playwright keyboard events bypass LoroSyncPlugin (synthetic transactions are not recorded by the CRDT), so typed content doesn’t persist across navigation. The test couldn’t construct Loro BLOB bytes manually from TypeScript.

Solution

Add a save_block_content_markdown bridge route that accepts raw markdown text, converts it server-side to a LoroDoc BLOB, saves via SaveBlockContentUseCase, and then runs UpdateReferencesUseCase to update the reference index.

apps/http-bridge/src/routes/page.rs

```rust
/// Save block content from a plain markdown string and update references.
///
/// Convenience route for QA tests: accepts raw markdown, converts it to a
/// LoroDoc BLOB, saves via `SaveBlockContentUseCase`, then runs reference
/// extraction. Eliminates the need for tests to construct Loro BLOBs manually.
pub async fn save_block_content_markdown(
    Extension(state): Extension<Arc<BridgeState>>,
    Json(args): Json<SaveBlockContentMarkdownArgs>,
) -> Result<Json<serde_json::Value>, CommandError> {
    validate_slug_input(&args.slug, "slug")?;
    validate_content_size(args.markdown.len())?;
    state.require_workspace()?;
    let guard = state.resolve_owner_guard()?;

    // Convert markdown text to a LoroDoc BLOB
    let (content_bytes, _version) = text_to_loro_bytes(&args.markdown);

    // Save via the canonical use case (writes BLOB + text to SQLite)
    let save_use_case = SaveBlockContentUseCase::new(state.page_repo());
    save_use_case
        .execute(&guard, &args.slug, &content_bytes, &args.markdown)
        .map_err(|e| sanitize_error(e, "save_block_content_markdown"))?;

    // Update reference index
    let update_refs = UpdateReferencesUseCase::new(state.page_repo(), state.reference_repo());
    if let Ok(page) = state.page_repository().get_by_slug(&args.slug)
        && let Err(e) = update_refs.execute(page.id)
    {
        tracing::warn!(error = %e, slug = %args.slug, "reference_index_update_failed");
    }

    Ok(Json(serde_json::json!(null)))
}
```

A separate update_references route was also added for tests that only need to trigger reference index synchronization without changing content.

Test usage

Tests call save_block_content_markdown via window.__TAURI_INTERNALS__.invoke() (which the bridge shim routes to the HTTP bridge) to inject wiki-link syntax before verifying backlink/outgoing-link APIs:

tests/e2e/tests/editor-wiki-links.spec.ts

```typescript
// Inject wiki-link via bridge API — saves markdown to LoroDoc BLOB and
// updates the reference index, bypassing the CRDT keyboard-event limitation.
await appPage.evaluate(async ({ slug, markdown }) => {
  await (window as any).__TAURI_INTERNALS__.invoke('save_block_content_markdown', {
    slug,
    markdown,
  });
}, { slug: sourceSlug, markdown: `Check out [[${targetName}|${targetSlug}]]` });
```

This pattern unblocked 3 wiki-link tests (scenarios 9, 11) that required backlinks to be established before asserting the detail panel.

When to add a bridge convenience route

  • The operation exists at the application layer and has full use-case coverage
  • Tests need to inject state that only the backend can construct (CRDT BLOBs, reference indices)
  • Constructing the necessary input type in TypeScript would require pulling in large native libraries (loro-crdt npm)
  • The route is clearly test-only (document it as a /// Convenience route for QA tests: comment)
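In specs, the raw invoke call can be wrapped in a small helper so tests read declaratively. The sketch below is hypothetical (the helper name and the injected `invoke` parameter are not from the codebase; in a real spec `invoke` would be `window.__TAURI_INTERNALS__.invoke`):

```typescript
// Hypothetical test helper: wraps the bridge invoke call so specs say
// "inject this markdown" instead of repeating the invoke plumbing.
// The `invoke` function is injected so the helper is testable outside
// the browser context.
type Invoke = (cmd: string, args: Record<string, unknown>) => Promise<unknown>;

async function injectMarkdown(invoke: Invoke, slug: string, markdown: string): Promise<void> {
  // Route name matches the bridge convenience route described above.
  await invoke('save_block_content_markdown', { slug, markdown });
}
```

Injecting `invoke` also lets the helper run against a fake in unit tests of the test harness itself.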

Key Pattern 2: Re-wireable Workspace Context

Problem

Multi-workspace tests (initialize_workspace, open_workspace, switch_workspace) required the bridge to target a different SQLite database after the workspace changed. The original BridgeState held all per-workspace repos as bare fields, wired at startup time — they could not be replaced at runtime.

Solution

Extract all per-workspace repos into a WorkspaceContext struct and hold it behind a parking_lot::Mutex. Add accessor methods that go through the mutex. switch_workspace() atomically replaces the context.

apps/http-bridge/src/state.rs

```rust
/// Hot-swappable workspace context. Rebuilt when the bridge switches workspaces.
pub struct WorkspaceContext {
    #[allow(dead_code)]
    pub db: Arc<WorkspaceDatabase>,
    pub page_repository: Arc<SqlitePageRepository>,
    pub reference_repository: SqliteReferenceRepository,
    pub event_log_repository: Arc<SqliteEventLogRepository>,
    pub embedding_repository: Arc<SqliteEmbeddingRepository>,
}

pub struct BridgeState {
    pub current_workspace: Mutex<Option<Workspace>>,
    /// Hot-swappable per-workspace repos. Protected by a mutex so
    /// `initialize_workspace` and `open_workspace` can rewire them atomically.
    pub workspace_context: Mutex<WorkspaceContext>,
    // ... other fields remain direct (settings, analytics, auth are not per-workspace)
}

impl BridgeState {
    /// Get a clone of the page repository.
    pub fn page_repo(&self) -> SqlitePageRepository {
        self.workspace_context.lock().page_repository.as_ref().clone()
    }

    /// Get an Arc to the page repository.
    pub fn page_repository(&self) -> Arc<SqlitePageRepository> {
        Arc::clone(&self.workspace_context.lock().page_repository)
    }

    /// Get a clone of the reference repository.
    pub fn reference_repo(&self) -> SqliteReferenceRepository {
        self.workspace_context.lock().reference_repository.clone()
    }

    /// Switch the active workspace — rebuilds all per-workspace repos atomically.
    pub fn switch_workspace(
        &self,
        workspace: Workspace,
    ) -> Result<(), Box<dyn std::error::Error>> {
        let new_ctx = WorkspaceContext::open(&workspace.path)?;
        *self.workspace_context.lock() = new_ctx;
        *self.current_workspace.lock() = Some(workspace);
        // Invalidate the search router — it holds Arcs to the old repos.
        *self.search_router.lock() = None;
        Ok(())
    }
}
```

All route handlers that previously accessed repos directly now go through the accessor methods:

```rust
// Before:
let page = state.page_repository.get_by_slug(&slug)?;

// After:
let page = state.page_repository().get_by_slug(&slug)?;
```

This unblocked 5 workspace-operations tests (scenarios 9, 11, 17, 25, 26) that required creating a temporary workspace, switching to it, operating, then restoring the original.

Design notes

  • The mutex lock is held only for the duration of the accessor call (lock → clone → unlock). No lock is held across await points.
  • parking_lot::Mutex is used (not tokio::sync::Mutex) because all bridge handlers are sync; the mutex prevents TOCTOU when two requests arrive simultaneously during a workspace switch.
  • The search router is invalidated on switch because it holds Arc references to the old repos and would otherwise serve search results from the wrong database.
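The same re-wiring shape can be sketched in TypeScript to make the accessor pattern concrete. This is an illustration under assumed, simplified types (single-threaded JS needs no mutex, so only the "one swappable field, read through accessors" structure carries over):

```typescript
// Sketch of the re-wireable context pattern: all per-workspace state lives
// in one object; accessors read through it, and switching replaces the
// whole object in a single assignment.
interface WorkspaceContext {
  dbPath: string;
  pageRepository: { getBySlug(slug: string): string | undefined };
}

class BridgeState {
  // In the Rust version this field sits behind a parking_lot::Mutex.
  private context: WorkspaceContext;

  constructor(initial: WorkspaceContext) {
    this.context = initial;
  }

  // Accessor: always read through the current context, never cache it.
  pageRepository() {
    return this.context.pageRepository;
  }

  // Switch: build the new context first, then replace it wholesale so no
  // handler ever observes a half-rewired state.
  switchWorkspace(next: WorkspaceContext): void {
    this.context = next;
  }
}
```

The key property is that handlers hold no direct references to repos; everything flows through the accessor, so a switch is observed by the very next call.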

Key Pattern 3: Consistent Fixme Disposition Format

Problem

Original fixme tests had comments like // TODO: figure out how to test this or no comment at all. Reviewers could not determine whether the fixme represented a missing feature, a tooling gap, or a known architectural limitation. Over time, fixmes became permanent without anyone tracking what would unblock them.

Solution

All fixme tests now follow a three-format classification system:

BLOCKED

Use when the test could be implemented but is missing a specific piece of infrastructure. The comment must state exactly what is missing and what would unblock it.

```typescript
// BLOCKED: update_recent_workspace_icon is not exposed on the bridge and
// the icon picker UI depends on Tauri-native APIs (AppHandle for menu
// refresh). Requires: (1) bridge route for update_recent_workspace_icon,
// (2) basic icon picker UI in the bridge-compatible frontend. New feature.
test.fixme('16. Set a workspace icon', async () => {
  // Needs: bridge route + icon picker UI component.
});
```

PERMANENT FIXME

Use when the test cannot work at the sentinel layer due to an architectural boundary, and the behavior is adequately covered at a lower layer. Must reference the specific lower-layer tests.

```typescript
// BLOCKED: Wiki-link pill rendering requires LoroDoc with ProseMirror-compatible
// node structure. save_block_content_markdown stores plain text in the CRDT blob;
// the editor does not parse [[...]] markdown syntax into pills on load.
// Requires loro-crdt npm in sentinel tests OR a bridge route that constructs
// ProseMirror-compatible LoroDoc nodes.
//
// COVERED BY: Application layer unit tests in crates/application/src/page/
test.fixme('8. Ghost link resolves after target page is created', async ({ appPage }) => {
  // ...
});

// Requires Tauri runtime: verifying cross-session persistence needs the app
// to shut down and restart, which is outside the scope of the bridge harness.
test.fixme('37. Recent list persists across sessions (settings persistence)', async () => {
  // ...
});
```

COVERED BY

Use when there is an exact functional equivalent at a lower layer. Reference the specific test file and function name.

```typescript
// COVERED BY: UpdateReferencesUseCase unit tests + rename integration tests
test.fixme('18. Rename page propagates updated display text to linking pages', ...
```

Classification decision tree

```text
Is the feature implemented at the backend?
├── No  → BLOCKED (describe what needs to be built)
└── Yes → Does the test work at the bridge/UI layer?
    ├── Yes → Convert to active (not fixme)
    └── No  → Is the behavior covered at a lower layer?
        ├── Yes → COVERED BY (reference the lower-layer test)
        └── No  → BLOCKED (describe what tooling/infra is missing)
```
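The decision tree can be expressed as a small function, which also makes the classification mechanically checkable. This is a sketch (the function and option names are illustrative, not from the codebase):

```typescript
// Each fixme is classified by answering the three questions from the
// decision tree; the return value is the disposition label to put in
// the fixme comment.
type Disposition = 'BLOCKED' | 'ACTIVE' | 'COVERED BY';

function classifyFixme(opts: {
  backendImplemented: boolean;
  worksAtSentinelLayer: boolean;
  coveredAtLowerLayer: boolean;
}): Disposition {
  if (!opts.backendImplemented) return 'BLOCKED';   // describe what needs to be built
  if (opts.worksAtSentinelLayer) return 'ACTIVE';   // convert to active, drop the fixme
  if (opts.coveredAtLowerLayer) return 'COVERED BY'; // reference the lower-layer test
  return 'BLOCKED';                                  // describe the missing tooling/infra
}
```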

Key Pattern 4: Try/Finally Workspace Restoration

Problem

Tests that switch the active workspace as part of their scenario contaminate all subsequent tests in the suite. If a test leaves the bridge pointing at a temporary workspace and that workspace is deleted in cleanup, the next test fails with a missing-database error.

Solution

All workspace-switching tests follow this pattern:

  1. Capture the original workspace identity before the test begins
  2. Perform the test operation (initialize, switch, open another workspace)
  3. Restore the original workspace in a finally block
  4. Delete temporary directories in the same finally block

tests/e2e/tests/workspace-operations.spec.ts

```typescript
test('25. Switch between two workspaces', async ({ bridgePort }) => {
  // Capture the original workspace so we can restore it afterwards.
  const origRes = await fetch(`http://localhost:${bridgePort}/invoke/get_current_workspace`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: '{}',
  });
  expect(origRes.ok).toBe(true);
  const origWorkspace = await origRes.json();

  const tmpA = fs.mkdtempSync(path.join(os.tmpdir(), 'inklings-ws-a-'));
  const tmpB = fs.mkdtempSync(path.join(os.tmpdir(), 'inklings-ws-b-'));

  try {
    // ... test body: initialize A and B, switch between them ...
  } finally {
    // Restore original workspace so subsequent tests are not affected.
    await fetch(`http://localhost:${bridgePort}/invoke/open_workspace`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ path: origWorkspace.path }),
    });
    fs.rmSync(tmpA, { recursive: true, force: true });
    fs.rmSync(tmpB, { recursive: true, force: true });
  }
});
```

The finally block runs even if expect() throws, ensuring the bridge is always left in a known state regardless of test outcome.

Rules

  • Always use get_current_workspace at the start of the test to capture the original — do not hardcode the expected workspace path
  • Always restore with open_workspace before deleting the temp directory — if the deletion runs first and the bridge’s current workspace is inside the deleted directory, subsequent open_workspace calls will fail
  • Use fs.rmSync with { recursive: true, force: true } to clean up temp directories silently even if some files were not created
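The capture/try/finally-restore shape repeats across every workspace-switching test, so it can be factored into a generic wrapper. This is a hypothetical sketch (the wrapper name is not from the codebase; `capture` and `restore` stand in for the `get_current_workspace` and `open_workspace` bridge calls):

```typescript
// Generic capture → run → finally-restore wrapper. The finally block runs
// whether the body resolves or throws, so the bridge is always returned
// to its original workspace.
async function withRestoredWorkspace<W, T>(
  capture: () => Promise<W>,
  restore: (w: W) => Promise<void>,
  body: () => Promise<T>,
): Promise<T> {
  const original = await capture(); // 1. capture before the test begins
  try {
    return await body();            // 2. run the workspace-switching body
  } finally {
    await restore(original);        // 3. restore even if the body threw
  }
}
```

Temp-directory cleanup can be folded into `restore` or a second finally step, as long as restoration happens before any deletion that could remove the current workspace.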

Key Pattern 5: lint-staged Failure Drops Staging

Problem

During the Wave 4 commit (bridge state re-wiring), a clippy error caused the pre-commit hook to abort. The lint-staged failure silently unstaged all previously staged files. The retry committed only the clippy-fixed file, not the full changeset.

Mechanism

lint-staged uses a stash-based isolation strategy:

  1. Before running linters: save unstaged changes via git stash
  2. Run linters against only the staged snapshot
  3. If linters fail: restore the stash — which replaces the entire index, unstaging all previously staged files
```text
git add file1..file16          # 16 files staged
git commit                     # triggers lint-staged
├─ lint-staged: git stash      # saves unstaged state
├─ lint-staged: run clippy     # FAILS
└─ lint-staged: git stash pop  # restores pre-commit state
   # → ALL 16 files now UNSTAGED
git add fixedFile              # only 1 file staged
git commit                     # succeeds with fewer files than intended
```

Solution

After any pre-commit hook failure, re-stage ALL intended files before retrying:

```sh
# After a pre-commit hook failure:
git status            # Confirm files are unstaged
git add path/to/file1 path/to/file2 ... path/to/fix
git commit -m "..."   # Retry with full staging
```

For agent teams using leader-commits mode, keep a running list of all files to stage so the full list can be re-applied after any failure.

See the full write-up at docs/solutions/workflow-issues/lint-staged-failure-drops-staged-files.md.


Results

INK-358 Remediation (5 Waves, ~10 Agent Instances)

| Metric | Start | End | Change |
| --- | --- | --- | --- |
| Active tests | 275 | 307 | +32 (+12%) |
| Fixme entries | 33 | 18 | -15 (-45%) |
| Skip entries | 0 | 0 | No change |
| Merge conflicts | | 0 | Zero conflicts |
| Bridge routes added | | 2 | save_block_content_markdown, update_references |
| UI improvements | | 2 | Non-cascade delete, restore-ancestors prompt |

Fixme Disposition Summary

Of the original 33 fixme tests:

  • 15 converted to active: Via bridge route additions, re-wireable workspace context, or UI improvements
  • 12 re-dispositioned as BLOCKED: Now have explicit unblock criteria (missing bridge route, missing frontend component)
  • 6 re-dispositioned as COVERED BY/PERMANENT: Architectural boundary (CRDT pill rendering, session persistence) with lower-layer coverage references

Zero fixme tests remain with “TODO” or empty comments — all have actionable disposition.


Execution Checklist

Before starting sentinel remediation:

  • Run pnpm test:e2e and capture baseline counts (active, fixme, skip, flaky)
  • Categorize all fixme tests: bridge gap / UI gap / CRDT boundary / cross-session / truly untestable
  • Identify bridge routes that would unblock the most tests (highest yield first)
  • Identify any BridgeState structural limitations blocking multi-state tests
  • Plan wave ordering: infrastructure improvements before test conversions that depend on them

During execution:

  • Wave 1 (quick wins): test-only fixes with no infrastructure changes — zero blocking risk
  • Wave 1 (bridge): add convenience routes needed by subsequent waves
  • Wave 2 (content injection): convert tests that needed the new bridge routes
  • Wave 3+ (UI gaps): add missing UI components or improved interaction paths
  • Wave 4 (state gaps): any BridgeState structural changes needed for test scenarios
  • Final wave (disposition): update all remaining fixme comments to BLOCKED/COVERED format

After each wave:

  • Run pnpm test:e2e — verify active count increased and no regressions
  • Check git status before committing — verify full expected changeset is staged
  • After any pre-commit hook failure, re-stage ALL intended files before retrying

References

  • Branch: feat/ink-358-sentinel-conversion
  • Key commits:
    • efc83f8 feat(bridge): add update_references + save_block_content_markdown routes
    • 2165e09 feat(bridge): add re-wireable workspace context for multi-workspace support
    • 7a9ab60 feat(qa): convert wiki-link backlink/outgoing tests to API-driven approach
    • 9e6d13d feat(qa): unskip workspace operation tests 9, 11, 17, 25, 26
    • 6b84036 docs(qa): update fixme dispositions with BLOCKED format and coverage refs
  • Bridge state: apps/http-bridge/src/state.rs (WorkspaceContext and BridgeState)
  • Bridge routes: apps/http-bridge/src/routes/page.rs (save_block_content_markdown, update_references)
  • Test files:
    • tests/e2e/tests/editor-wiki-links.spec.ts — API-driven backlink injection
    • tests/e2e/tests/workspace-operations.spec.ts — multi-workspace with try/finally
  • Related solutions:
    • docs/solutions/workflow-issues/lint-staged-failure-drops-staged-files.md
    • docs/solutions/patterns/partitioned-test-execution-backend-isolation.md
    • docs/solutions/workflow-issues/agent-team-orchestration-at-scale.md
