Wave-Based Sentinel Test Remediation Patterns
Problem
Large QA Sentinel test suites accumulate fixme and skip entries as features are added faster than UI test coverage
can keep up. Without a systematic approach to remediation, fixme lists grow to 30+ items across 11 spec files, making it
unclear which tests reflect real product gaps vs. infrastructure limitations.
INK-358 starting state: 275 active tests, 33 fixme tests across 11 spec files.
The 33 fixme tests fell into several failure categories:
- Tests needing wiki-link backlinks couldn’t inject content without fighting Playwright’s CRDT boundary
- Multi-workspace switching scenarios failed because `BridgeState` held hard-wired workspace repos
- Trash UI was missing non-cascade delete and restore-ancestors paths
- Fixme comments said “TODO” without specifying what was needed to unblock them
- Some tests were genuinely untestable at the sentinel layer and needed proper disposition
Solution
Five-Wave Execution Strategy
The remediation ran in 5 waves with parallel agent teams:
```
Wave 1: Quick wins + bridge enhancement
├── Quick-wins agent: test-only fixes (no bridge changes)
└── Bridge agent: add save_block_content_markdown + update_references routes

Wave 2: Content-injection tests (depends on Wave 1 bridge routes)
├── Wiki-links agent: convert backlink/outgoing tests to API-driven approach
└── Page attribute agent: convert export test 16 to bridge API approach

Wave 3: UI improvements (parallel with Wave 4)
└── Trash agent: add non-cascade delete + restore-ancestors prompt to UI

Wave 4: Re-wireable BridgeState for multi-workspace support
└── Bridge-state agent: WorkspaceContext struct + switch_workspace + accessor methods

Wave 5: Final disposition sweep
└── Fixme agent: update remaining fixme comments to BLOCKED/PERMANENT/COVERED format
```
Each wave ran agents in parallel with strict file ownership. The leader committed at wave boundaries.
Key Pattern 1: Bridge Route as Test Enabler
Problem
Tests for wiki-link backlinks needed to inject markdown with [[Display|slug]] syntax into page content. Playwright
keyboard events bypass LoroSyncPlugin (synthetic transactions are not recorded by the CRDT), so typed content doesn’t
persist across navigation. The test couldn’t construct Loro BLOB bytes manually from TypeScript.
Solution
Add a save_block_content_markdown bridge route that accepts raw markdown text, converts it server-side to a LoroDoc
BLOB, saves via SaveBlockContentUseCase, and then runs UpdateReferencesUseCase to update the reference index.
```rust
/// Save block content from a plain markdown string and update references.
///
/// Convenience route for QA tests: accepts raw markdown, converts it to a
/// LoroDoc BLOB, saves via `SaveBlockContentUseCase`, then runs reference
/// extraction. Eliminates the need for tests to construct Loro BLOBs manually.
pub async fn save_block_content_markdown(
    Extension(state): Extension<Arc<BridgeState>>,
    Json(args): Json<SaveBlockContentMarkdownArgs>,
) -> Result<Json<serde_json::Value>, CommandError> {
    validate_slug_input(&args.slug, "slug")?;
    validate_content_size(args.markdown.len())?;

    state.require_workspace()?;
    let guard = state.resolve_owner_guard()?;

    // Convert markdown text to a LoroDoc BLOB
    let (content_bytes, _version) = text_to_loro_bytes(&args.markdown);

    // Save via the canonical use case (writes BLOB + text to SQLite)
    let save_use_case = SaveBlockContentUseCase::new(state.page_repo());
    save_use_case
        .execute(&guard, &args.slug, &content_bytes, &args.markdown)
        .map_err(|e| sanitize_error(e, "save_block_content_markdown"))?;

    // Update reference index
    let update_refs = UpdateReferencesUseCase::new(state.page_repo(), state.reference_repo());
    if let Ok(page) = state.page_repository().get_by_slug(&args.slug)
        && let Err(e) = update_refs.execute(page.id)
    {
        tracing::warn!(error = %e, slug = %args.slug, "reference_index_update_failed");
    }

    Ok(Json(serde_json::json!(null)))
}
```
A separate update_references route was also added for tests that only need to trigger reference index synchronization without changing content.
Test usage
Tests call save_block_content_markdown via window.__TAURI_INTERNALS__.invoke() (which the bridge shim routes to the
HTTP bridge) to inject wiki-link syntax before verifying backlink/outgoing-link APIs:
```typescript
// Inject wiki-link via bridge API — saves markdown to LoroDoc BLOB and
// updates the reference index, bypassing the CRDT keyboard-event limitation.
await appPage.evaluate(async ({ slug, markdown }) => {
  await (window as any).__TAURI_INTERNALS__.invoke('save_block_content_markdown', {
    slug,
    markdown,
  });
}, { slug: sourceSlug, markdown: `Check out [[${targetName}|${targetSlug}]]` });
```
This pattern unblocked 3 wiki-link tests (scenarios 9, 11) that required backlinks to be established before asserting the detail panel.
When to add a bridge convenience route
- The operation exists at the application layer and has full use-case coverage
- Tests need to inject state that only the backend can construct (CRDT BLOBs, reference indices)
- Constructing the necessary input type in TypeScript would require pulling in large native libraries (the `loro-crdt` npm package)
- The route is clearly test-only (document it with a `/// Convenience route for QA tests:` comment)
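When several specs need such injections, the invoke call can be wrapped in a shared helper. This is a sketch, not code from the branch: `wikiLink` and `injectMarkdown` are hypothetical names, and the structural `EvaluatePage` type stands in for Playwright's `Page` fixture.

```typescript
declare const window: any;

// Minimal structural stand-in for the slice of Playwright's Page API used here.
type EvaluatePage = {
  evaluate(
    fn: (args: { slug: string; markdown: string }) => Promise<void>,
    args: { slug: string; markdown: string },
  ): Promise<void>;
};

// Hypothetical formatter for the [[Display|slug]] wiki-link syntax shown above.
export function wikiLink(display: string, slug: string): string {
  return `[[${display}|${slug}]]`;
}

// Hypothetical wrapper: inject markdown through the test-only bridge route.
// Assumes the bridge shim routes __TAURI_INTERNALS__.invoke to the HTTP bridge.
export async function injectMarkdown(
  page: EvaluatePage,
  slug: string,
  markdown: string,
): Promise<void> {
  await page.evaluate(async (args) => {
    await (window as any).__TAURI_INTERNALS__.invoke('save_block_content_markdown', args);
  }, { slug, markdown });
}
```

A spec would then call `await injectMarkdown(appPage, sourceSlug, wikiLink(targetName, targetSlug))` before asserting on the backlink and outgoing-link APIs.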
Key Pattern 2: Re-wireable Workspace Context
Problem
Multi-workspace tests (initialize_workspace, open_workspace, switch_workspace) required the bridge to target a
different SQLite database after the workspace changed. The original BridgeState held all per-workspace repos as bare
fields, wired at startup time — they could not be replaced at runtime.
Solution
Extract all per-workspace repos into a WorkspaceContext struct and hold it behind a parking_lot::Mutex. Add accessor
methods that go through the mutex. switch_workspace() atomically replaces the context.
```rust
/// Hot-swappable workspace context. Rebuilt when the bridge switches workspaces.
pub struct WorkspaceContext {
    #[allow(dead_code)]
    pub db: Arc<WorkspaceDatabase>,
    pub page_repository: Arc<SqlitePageRepository>,
    pub reference_repository: SqliteReferenceRepository,
    pub event_log_repository: Arc<SqliteEventLogRepository>,
    pub embedding_repository: Arc<SqliteEmbeddingRepository>,
}

pub struct BridgeState {
    pub current_workspace: Mutex<Option<Workspace>>,

    /// Hot-swappable per-workspace repos. Protected by a mutex so
    /// `initialize_workspace` and `open_workspace` can rewire them atomically.
    pub workspace_context: Mutex<WorkspaceContext>,

    // ... other fields remain direct (settings, analytics, auth are not per-workspace)
}

impl BridgeState {
    /// Get a clone of the page repository.
    pub fn page_repo(&self) -> SqlitePageRepository {
        self.workspace_context.lock().page_repository.as_ref().clone()
    }

    /// Get an Arc to the page repository.
    pub fn page_repository(&self) -> Arc<SqlitePageRepository> {
        Arc::clone(&self.workspace_context.lock().page_repository)
    }

    /// Get a clone of the reference repository.
    pub fn reference_repo(&self) -> SqliteReferenceRepository {
        self.workspace_context.lock().reference_repository.clone()
    }

    /// Switch the active workspace — rebuilds all per-workspace repos atomically.
    pub fn switch_workspace(
        &self,
        workspace: Workspace,
    ) -> Result<(), Box<dyn std::error::Error>> {
        let new_ctx = WorkspaceContext::open(&workspace.path)?;
        *self.workspace_context.lock() = new_ctx;
        *self.current_workspace.lock() = Some(workspace);
        // Invalidate the search router — it holds Arcs to the old repos.
        *self.search_router.lock() = None;
        Ok(())
    }
}
```
All route handlers that previously accessed repos directly now go through the accessor methods:
```rust
// Before:
let page = state.page_repository.get_by_slug(&slug)?;

// After:
let page = state.page_repository().get_by_slug(&slug)?;
```
This unblocked 5 workspace-operations tests (scenarios 9, 11, 17, 25, 26) that required creating a temporary workspace, switching to it, operating, then restoring the original.
Design notes
- The mutex lock is held only for the duration of the accessor call (lock → clone → unlock). No lock is held across await points.
- `parking_lot::Mutex` is used (not `tokio::sync::Mutex`) because all bridge handlers are sync; the mutex prevents TOCTOU when two requests arrive simultaneously during a workspace switch.
- The search router is invalidated on switch because it holds `Arc` references to the old repos and would otherwise serve search results from the wrong database.
Key Pattern 3: Consistent Fixme Disposition Format
Problem
Original fixme tests had comments like // TODO: figure out how to test this or no comment at all. Reviewers could not
determine whether the fixme represented a missing feature, a tooling gap, or a known architectural limitation. Over
time, fixmes became permanent without anyone tracking what would unblock them.
Solution
All fixme tests now follow a three-format classification system:
BLOCKED
Use when the test could be implemented but is missing a specific piece of infrastructure. The comment must state exactly what is missing and what would unblock it.
```typescript
// BLOCKED: update_recent_workspace_icon is not exposed on the bridge and
// the icon picker UI depends on Tauri-native APIs (AppHandle for menu
// refresh). Requires: (1) bridge route for update_recent_workspace_icon,
// (2) basic icon picker UI in the bridge-compatible frontend. New feature.
test.fixme('16. Set a workspace icon', async () => {
  // Needs: bridge route + icon picker UI component.
});
```
PERMANENT FIXME
Use when the test cannot work at the sentinel layer due to an architectural boundary, and the behavior is adequately covered at a lower layer. Must reference the specific lower-layer tests.
```typescript
// BLOCKED: Wiki-link pill rendering requires LoroDoc with ProseMirror-compatible
// node structure. save_block_content_markdown stores plain text in the CRDT blob;
// the editor does not parse [[...]] markdown syntax into pills on load.
// Requires loro-crdt npm in sentinel tests OR a bridge route that constructs
// ProseMirror-compatible LoroDoc nodes.
//
// COVERED BY: Application layer unit tests in crates/application/src/page/
test.fixme('8. Ghost link resolves after target page is created', async ({ appPage }) => {

// Requires Tauri runtime: verifying cross-session persistence needs the app
// to shut down and restart, which is outside the scope of the bridge harness.
test.fixme('37. Recent list persists across sessions (settings persistence)', async () => {
```
COVERED BY
Use when there is an exact functional equivalent at a lower layer. Reference the specific test file and function name.
```typescript
// COVERED BY: UpdateReferencesUseCase unit tests + rename integration tests
test.fixme('18. Rename page propagates updated display text to linking pages', ...
```
Classification decision tree
```
Is the feature implemented at the backend?
├── No → BLOCKED (describe what needs to be built)
├── Yes → Does the test work at the bridge/UI layer?
│   ├── Yes → Convert to active (not fixme)
│   └── No → Is the behavior covered at a lower layer?
│       ├── Yes → COVERED BY (reference the lower-layer test)
│       └── No → BLOCKED (describe what tooling/infra is missing)
```
Key Pattern 4: Try/Finally Workspace Restoration
Problem
Tests that switch the active workspace as part of their scenario contaminate all subsequent tests in the suite. If a test leaves the bridge pointing at a temporary workspace and that workspace is deleted in cleanup, the next test fails with a missing-database error.
Solution
All workspace-switching tests follow this pattern:
- Capture the original workspace identity before the test begins
- Perform the test operation (initialize, switch, open another workspace)
- Restore the original workspace in a `finally` block
- Delete temporary directories in the same `finally` block
```typescript
test('25. Switch between two workspaces', async ({ bridgePort }) => {
  // Capture the original workspace so we can restore it afterwards.
  const origRes = await fetch(`http://localhost:${bridgePort}/invoke/get_current_workspace`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: '{}',
  });
  expect(origRes.ok).toBe(true);
  const origWorkspace = await origRes.json();

  const tmpA = fs.mkdtempSync(path.join(os.tmpdir(), 'inklings-ws-a-'));
  const tmpB = fs.mkdtempSync(path.join(os.tmpdir(), 'inklings-ws-b-'));

  try {
    // ... test body: initialize A and B, switch between them ...
  } finally {
    // Restore original workspace so subsequent tests are not affected.
    await fetch(`http://localhost:${bridgePort}/invoke/open_workspace`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ path: origWorkspace.path }),
    });
    fs.rmSync(tmpA, { recursive: true, force: true });
    fs.rmSync(tmpB, { recursive: true, force: true });
  }
});
```
The `finally` block runs even if `expect()` throws, ensuring the bridge is always left in a known state regardless of test outcome.
Rules
- Always use `get_current_workspace` at the start of the test to capture the original — do not hardcode the expected workspace path
- Always restore with `open_workspace` before deleting the temp directory — if the deletion runs first and the bridge’s current workspace is inside the deleted directory, subsequent `open_workspace` calls will fail
- Use `fs.rmSync` with `{ recursive: true, force: true }` to clean up temp directories silently even if some files were not created
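The rules above amount to a capture → operate → restore discipline, which can be factored into a generic helper so a forgotten `finally` cannot slip through review. A sketch only — `withRestored` is a hypothetical utility, not something on the branch:

```typescript
// Hypothetical helper: capture a snapshot, run the body, and always restore
// from the snapshot afterwards — even when the body throws. Mirrors the
// try/finally shape of the workspace tests above.
export async function withRestored<S, T>(
  capture: () => Promise<S>,
  restore: (snapshot: S) => Promise<void>,
  body: () => Promise<T>,
): Promise<T> {
  const snapshot = await capture();
  try {
    return await body();
  } finally {
    await restore(snapshot);
  }
}
```

In a workspace test, `capture` would POST to `get_current_workspace` and `restore` would POST to `open_workspace`, with temp-directory deletion still happening only after the restore.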
Key Pattern 5: lint-staged Failure Drops Staged Files
Problem
During the Wave 4 commit (bridge state re-wiring), a clippy error caused the pre-commit hook to abort. The lint-staged failure silently unstaged all previously staged files. The retry committed only the clippy-fixed file, not the full changeset.
Mechanism
lint-staged uses a stash-based isolation strategy:
- Before running linters: save unstaged changes via `git stash`
- Run linters against only the staged snapshot
- If linters fail: restore the stash — which replaces the entire index, unstaging all previously staged files
```
git add file1..file16   # 16 files staged
git commit              # triggers lint-staged
  ├─ lint-staged: git stash      # saves unstaged state
  ├─ lint-staged: run clippy     # FAILS
  └─ lint-staged: git stash pop  # restores pre-commit state
                                 # → ALL 16 files now UNSTAGED
git add fixedFile       # only 1 file staged
git commit              # succeeds with fewer files than intended
```
Solution
After any pre-commit hook failure, re-stage ALL intended files before retrying:
```
# After a pre-commit hook failure:
git status           # Confirm files are unstaged
git add path/to/file1 path/to/file2 ... path/to/fix
git commit -m "..."  # Retry with full staging
```
For agent teams using leader-commits mode, keep a running list of all files to stage so the full list can be re-applied after any failure.
See the full write-up at docs/solutions/workflow-issues/lint-staged-failure-drops-staged-files.md.
Results
INK-358 Remediation (5 Waves, ~10 Agent Instances)
| Metric | Start | End | Change |
|---|---|---|---|
| Active tests | 275 | 307 | +32 (+12%) |
| Fixme entries | 33 | 18 | -15 (-45%) |
| Skip entries | 0 | 0 | No change |
| Merge conflicts | — | 0 | Zero conflicts |
| Bridge routes added | — | 2 | save_block_content_markdown, update_references |
| UI improvements | — | 2 | Non-cascade delete, restore-ancestors prompt |
Fixme Disposition Summary
Of the original 33 fixme tests:
- 15 converted to active: Via bridge route additions, re-wireable workspace context, or UI improvements
- 12 re-dispositioned as BLOCKED: Now have explicit unblock criteria (missing bridge route, missing frontend component)
- 6 re-dispositioned as COVERED BY/PERMANENT: Architectural boundary (CRDT pill rendering, session persistence) with lower-layer coverage references
Zero fixme tests remain with “TODO” or empty comments — all have actionable disposition.
Execution Checklist
Before starting sentinel remediation:
- Run `pnpm test:e2e` and capture baseline counts (active, fixme, skip, flaky)
- Categorize all fixme tests: bridge gap / UI gap / CRDT boundary / cross-session / truly untestable
- Identify bridge routes that would unblock the most tests (highest yield first)
- Identify any BridgeState structural limitations blocking multi-state tests
- Plan wave ordering: infrastructure improvements before test conversions that depend on them
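Capturing the baseline in the first step can be partly scripted by scanning spec sources for test-state markers. A minimal sketch, assuming Playwright's `test(` / `test.fixme(` / `test.skip(` spellings; `countMarkers` is an illustrative name:

```typescript
// Count test-state markers in one spec file's source text.
export function countMarkers(source: string): { active: number; fixme: number; skip: number } {
  const count = (re: RegExp) => (source.match(re) ?? []).length;
  return {
    // A bare `test(` is active; `test.fixme(` / `test.skip(` do not contain
    // the substring `test(`, so the three patterns never overlap.
    active: count(/\btest\(/g),
    fixme: count(/\btest\.fixme\(/g),
    skip: count(/\btest\.skip\(/g),
  };
}
```

Summing the result over every file under `tests/e2e/tests/` gives the active/fixme/skip baseline to compare against after each wave.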
During execution:
- Wave 1 (quick wins): test-only fixes with no infrastructure changes — zero blocking risk
- Wave 1 (bridge): add convenience routes needed by subsequent waves
- Wave 2 (content injection): convert tests that needed the new bridge routes
- Wave 3+ (UI gaps): add missing UI components or improved interaction paths
- Wave 4 (state gaps): any BridgeState structural changes needed for test scenarios
- Final wave (disposition): update all remaining fixme comments to BLOCKED/COVERED format
After each wave:
- Run `pnpm test:e2e` — verify active count increased and no regressions
- Check `git status` before committing — verify the full expected changeset is staged
- After any pre-commit hook failure, re-stage ALL intended files before retrying
References
- Branch: `feat/ink-358-sentinel-conversion`
- Key commits:
  - `efc83f8` — feat(bridge): add update_references + save_block_content_markdown routes
  - `2165e09` — feat(bridge): add re-wireable workspace context for multi-workspace support
  - `7a9ab60` — feat(qa): convert wiki-link backlink/outgoing tests to API-driven approach
  - `9e6d13d` — feat(qa): unskip workspace operation tests 9, 11, 17, 25, 26
  - `6b84036` — docs(qa): update fixme dispositions with BLOCKED format and coverage refs
- Bridge state: `apps/http-bridge/src/state.rs` — `WorkspaceContext` and `BridgeState`
- Bridge routes: `apps/http-bridge/src/routes/page.rs` — `save_block_content_markdown`, `update_references`
- Test files:
  - `tests/e2e/tests/editor-wiki-links.spec.ts` — API-driven backlink injection
  - `tests/e2e/tests/workspace-operations.spec.ts` — multi-workspace with try/finally
- Related solutions:
  - `docs/solutions/workflow-issues/lint-staged-failure-drops-staged-files.md`
  - `docs/solutions/patterns/partitioned-test-execution-backend-isolation.md`
  - `docs/solutions/workflow-issues/agent-team-orchestration-at-scale.md`