QA Testing System
Linear Epic: INK-358 (Sentinel Conversion)
Status: Playwright Test Agents with Sentinel wrapper
Depends On: HTTP Bridge (apps/http-bridge/), Vite dev server, Playwright
Overview
The QA testing system uses Playwright Test Agents wrapped by @qa-sentinel skills for browser-based testing of the full Inklings UI against the real Rust backend, without requiring a Tauri runtime.
A standalone HTTP bridge replaces Tauri’s IPC, and a browser shim (bridge-inject.js) intercepts the frontend’s
invoke() calls, routing them to the bridge via HTTP.
Two-Layer Architecture
- @qa-sentinel (skill layer): Manages the testing lifecycle — discovery, scaffolding, drift detection
- Playwright Test Agents (execution layer): Test planning, code generation, execution, and self-healing
Key principle: AI is used at authoring and maintenance time only. Test execution is npx playwright test —
deterministic, fast, no LLM costs.
Architecture
    Playwright (headless browser)
    ├── Vite dev server (localhost:1420)
    │   └── React frontend + bridge-inject.js shim
    │       └── HTTP bridge (localhost:9990–9993)
    └── Axum → BridgeState → Use Cases → SQLite

How It Works
- Vite serves the React frontend at localhost:1420 (no Tauri)
- bridge-inject.js is injected via Playwright’s addInitScript before any page scripts load
- The shim replaces window.__TAURI_INTERNALS__.invoke with HTTP fetch() calls to the bridge
- HTTP bridge receives commands at POST /invoke/{command}, executes them against real SQLite, returns JSON
- Playwright drives the UI via standard locators and assertions
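
The injection and routing steps above can be sketched as two pure helpers. This is a minimal illustration under stated assumptions, not the contents of bridge-inject.js: `buildInitScript` and `buildInvokeRequest` are hypothetical names, and sending the arguments as the JSON request body is an assumption. Only the `POST /invoke/{command}` path and the `window.__BRIDGE_BASE_URL` global come from this page.

```typescript
// Hypothetical helpers illustrating the shim flow; not the real bridge-inject.js.

interface InvokeRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Prepend the bridge base URL so the shim can read it before page scripts run.
function buildInitScript(shimSource: string, bridgeBaseUrl: string): string {
  return `window.__BRIDGE_BASE_URL = ${JSON.stringify(bridgeBaseUrl)};\n${shimSource}`;
}

// Translate an invoke(command, args) call into the HTTP request sent to the bridge.
function buildInvokeRequest(
  bridgeBaseUrl: string,
  command: string,
  args: Record<string, unknown> = {},
): InvokeRequest {
  return {
    url: `${bridgeBaseUrl}/invoke/${encodeURIComponent(command)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(args), // assumed: args are sent as the JSON body
    },
  };
}

const request = buildInvokeRequest("http://localhost:9990", "get_settings");
console.log(request.url);
```

In a test fixture, the string returned by `buildInitScript` could be handed to Playwright with `page.addInitScript({ content: script })`, which runs it before any page script.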
Layer Mapping
| Layer | Component | Location |
|---|---|---|
| Browser shim | bridge-inject.js | apps/http-bridge/bridge-inject.js |
| HTTP server | Axum router + route handlers | apps/http-bridge/src/ |
| Application state | BridgeState (mirrors Tauri AppState) | apps/http-bridge/src/state.rs |
| Route handlers | 83 commands across 7 modules | apps/http-bridge/src/routes/ |
| NL test specs | Natural language scenario definitions | tests/e2e/specs/ |
| Generated tests | Playwright .spec.ts files | tests/e2e/tests/ |
| Sentinel config | Project + partition config | tests/e2e/sentinel.config.yaml |
HTTP Bridge
Building and Starting
    # Build
    cargo build -p http-bridge

    # Start with a workspace
    cargo run -p http-bridge -- --port 9990 --workspace .data/workspaces/qa-pilot

    # Verify
    curl -s http://localhost:9990/health # → "ok"

CLI Flags
| Flag | Default | Description |
|---|---|---|
| --port | 9990 | HTTP listen port |
| --workspace | .data/workspaces/default | Workspace directory (auto-initialized) |
Command Tiers
| Tier | Count | Description |
|---|---|---|
| Tier 1 (Full) | ~70 | Fully implemented with real backend logic |
| Tier 2 (Stub) | ~10 | Return sensible defaults (e.g., get_settings) |
| Tier 3 (501) | ~2 | Not applicable outside Tauri (get_sync_status, refresh_recent_workspaces_menu) |
Serialization Convention
The bridge uses #[serde(rename_all = "camelCase")] on all Deserialize structs with multi-word fields, matching
Tauri’s automatic camelCase ↔ snake_case translation. This is critical — the React frontend sends camelCase JSON.
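
To make the convention concrete, the sketch below shows the key mapping that `rename_all = "camelCase"` implies, written in TypeScript purely for illustration. `snakeToCamel` and the field names in `rustFields` are hypothetical and do not come from the bridge source.

```typescript
// Illustrative only: mirrors the snake_case -> camelCase key mapping that
// serde's rename_all = "camelCase" applies to Rust struct fields.
function snakeToCamel(field: string): string {
  return field.replace(/_([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// Hypothetical Rust field names for a bridge request struct.
const rustFields = ["workspace_path", "parent_page_id", "title"];

// Keys the React frontend would be expected to send in the JSON body.
const jsonKeys = rustFields.map(snakeToCamel);
console.log(jsonKeys);
```

Single-word fields such as `title` are unchanged, which is why the attribute only matters for structs with multi-word fields.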
Running Tests
Prerequisites
- Rust toolchain (for bridge binary)
- Node.js + pnpm (for Vite dev server)
- Playwright installed in tests/e2e/
Quick Start
pnpm test:e2e handles bridge startup automatically via Playwright webServer config.
    # Run all tests (from project root)
    pnpm test:e2e

    # Or from tests/e2e/
    cd tests/e2e && pnpm test

Partitioned Execution
Tests run across 4 isolated bridge instances for data isolation:
| Partition | Bridge Port | Test Scope |
|---|---|---|
| workspace-pages | 9990 | workspace-*.spec.ts |
| editor-content | 9991 | editor-*.spec.ts |
| navigation-search | 9992 | navigation-*.spec.ts |
| advanced-features | 9993 | advanced-*.spec.ts |
Configuration in tests/e2e/playwright.config.ts maps each project to its bridge port and test directory.
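
The partition table can be expressed as data, as in the sketch below. This is not the actual playwright.config.ts: the `Partition` shape, the `toProjects` and `healthUrls` helpers, and the `bridgeBaseUrl` option are hypothetical names used only to show how a partition might be tied to its bridge port.

```typescript
// Sketch of the partition table as data; the real playwright.config.ts may differ.
interface Partition {
  name: string;
  port: number;
  testGlob: string; // "Test Scope" column from the table
}

const partitions: Partition[] = [
  { name: "workspace-pages", port: 9990, testGlob: "workspace-*.spec.ts" },
  { name: "editor-content", port: 9991, testGlob: "editor-*.spec.ts" },
  { name: "navigation-search", port: 9992, testGlob: "navigation-*.spec.ts" },
  { name: "advanced-features", port: 9993, testGlob: "advanced-*.spec.ts" },
];

// Build project-like entries. `bridgeBaseUrl` is a hypothetical custom option,
// not a built-in Playwright setting.
function toProjects(parts: Partition[]) {
  return parts.map((p) => ({
    name: p.name,
    testGlob: p.testGlob,
    use: { bridgeBaseUrl: `http://localhost:${p.port}` },
  }));
}

// Health endpoints for every partition bridge (each bridge serves /health).
function healthUrls(parts: Partition[]): string[] {
  return parts.map((p) => `http://localhost:${p.port}/health`);
}

const projects = toProjects(partitions);
console.log(projects.map((p) => `${p.name} -> ${p.use.bridgeBaseUrl}`));
```

Keeping the ports in one table means the shim's base URL and any health check can be derived from the same source instead of being repeated per project.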
Bridge Shim Injection
The shim is injected via Playwright’s addInitScript in the test fixtures. It must run before the React app loads:
    // Prepend to bridge-inject.js contents:
    window.__BRIDGE_BASE_URL = "http://localhost:9990";
    // Then the full bridge-inject.js contents

Supabase Stub Environment
The frontend requires SUPABASE_URL and SUPABASE_PUBLISHABLE_KEY at build time. Use stub values:
    SUPABASE_URL="http://localhost:54321"
    SUPABASE_PUBLISHABLE_KEY="stub-anon-key-for-qa"

The bridge itself falls back to a stub SupabaseConfig when env vars aren’t set and uses in-memory auth.
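
One way to guarantee the stub values are present for a QA run is to fill them in only when the variables are missing. The helper below is a hypothetical sketch (`withSupabaseStubs` is not part of the project, and how the resulting environment reaches the Vite build is not specified on this page).

```typescript
// Stub values taken from this page; intended only for QA runs against the bridge.
const SUPABASE_STUBS = {
  SUPABASE_URL: "http://localhost:54321",
  SUPABASE_PUBLISHABLE_KEY: "stub-anon-key-for-qa",
} as const;

type EnvMap = Record<string, string | undefined>;

// Return a copy of `env` in which missing or empty Supabase variables get stub values.
function withSupabaseStubs(env: EnvMap): EnvMap {
  const result: EnvMap = { ...env };
  const keys = Object.keys(SUPABASE_STUBS) as Array<keyof typeof SUPABASE_STUBS>;
  keys.forEach((key) => {
    if (!result[key]) result[key] = SUPABASE_STUBS[key];
  });
  return result;
}

const qaEnv = withSupabaseStubs({ SUPABASE_URL: "" });
console.log(qaEnv.SUPABASE_URL, qaEnv.SUPABASE_PUBLISHABLE_KEY);
```

Values that are already set are left untouched, so a developer pointing at a real Supabase instance is not silently overridden.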
Sentinel Skills
Three skills manage the testing lifecycle:
| Skill | Purpose |
|---|---|
| qa-sentinel:setup | Scaffold project: Playwright agents, NL spec dirs, config |
| qa-sentinel:discover | Critical path + persona analysis → NL test specs |
| qa-sentinel:audit | Drift detection (NL specs ↔ app ↔ tests) → corrections |
NL Spec Format
Natural language specs in tests/e2e/specs/ are the durable contract between discovery and test generation. They use
YAML frontmatter + markdown with numbered H3 scenarios. See any spec file for the canonical format.
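
As a rough illustration of how tooling could read such a spec, the sketch below extracts numbered H3 scenarios. The exact layout is an assumption (frontmatter delimited by `---` lines and headings of the form `### 1. Title`); the spec files themselves remain the canonical reference, and `extractScenarios` and the sample content are hypothetical.

```typescript
// Assumed layout: YAML frontmatter between '---' lines, then '### <n>. <title>' headings.
interface Scenario {
  number: number;
  title: string;
}

function extractScenarios(spec: string): Scenario[] {
  // Drop a leading frontmatter block if present.
  const body = spec.replace(/^---\n[\s\S]*?\n---\n/, "");
  const scenarios: Scenario[] = [];
  for (const line of body.split("\n")) {
    const match = /^###\s+(\d+)[.)]?\s+(.+)$/.exec(line.trim());
    if (match) {
      scenarios.push({ number: Number(match[1]), title: (match[2] ?? "").trim() });
    }
  }
  return scenarios;
}

// Hypothetical spec content, for illustration only.
const sample = [
  "---",
  "feature: workspace-trash",
  "---",
  "### 1. Move a page to trash",
  "Steps and expectations go here.",
  "### 2. Restore a page from trash",
].join("\n");

console.log(extractScenarios(sample));
```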
Test Generation Pipeline
    NL Specs → Playwright Planner → Test Plans → Playwright Generator → .spec.ts

Generated .spec.ts files are regenerable from specs. NL specs survive Playwright API changes and UI restructures.
Known Bridge Gaps
| Gap | Reason | Impact |
|---|---|---|
| Tauri events (emit/listen) | Stubbed; no event bus in bridge | Event-driven UI features won’t fire |
| Sync lifecycle | No real sync engine in bridge | Sync status always “not connected” |
| Native dialogs | No Tauri window API | File browse dialogs don’t open |
| Window menu refresh | No native menu | Recent workspaces menu not updated |
| React strict mode | Double-invokes effects in dev | First initialize_workspace may hit “database is locked” — second succeeds |
Adding New Commands
When new Tauri commands are added to apps/desktop/src-tauri/src/commands/:
- Add the route handler in the appropriate apps/http-bridge/src/routes/*.rs module
- Register the route in apps/http-bridge/src/router.rs
- Add #[serde(rename_all = "camelCase")] to any Deserialize structs with multi-word fields
- Test via curl -s -X POST localhost:9990/invoke/{command} before browser testing
This manual sync is tracked as a maintenance concern (see INK-234 for macro-based generation).
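
Until generation is automated, a small comparison can surface commands that still lack a bridge route. The sketch below is hypothetical: `findMissingRoutes` is not an existing tool, `export_pages` is an invented command name, and the two input lists would have to be extracted from the commands/ directory and router.rs by some other means.

```typescript
// Compare Tauri command names with bridge routes to spot handlers still to be added.
function findMissingRoutes(tauriCommands: string[], bridgeRoutes: string[]): string[] {
  const registered = new Set(bridgeRoutes);
  return tauriCommands.filter((name) => !registered.has(name)).sort();
}

// Hypothetical inputs for illustration.
const tauriCommands = ["get_settings", "initialize_workspace", "export_pages"];
const bridgeRoutes = ["get_settings", "initialize_workspace"];

for (const name of findMissingRoutes(tauriCommands, bridgeRoutes)) {
  // Print the manual check suggested in the steps above.
  console.log(`curl -s -X POST localhost:9990/invoke/${name}`);
}
```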
Tauri-Native Smoke Tests
These scenarios run in the real Tauri app (not the HTTP bridge) to verify behaviors the bridge cannot replicate: event propagation, background workers, and native dialogs.
Prerequisites
- Real Tauri app launched via tools/qa/start-tauri-smoke.sh
- Safari Web Inspector connected to the Tauri webview
Connecting to the Tauri Webview
On macOS, Tauri uses WKWebView — Chrome DevTools Protocol (CDP) cannot connect to it. Use Safari Web Inspector:
- Open Safari
- Safari menu → Settings → Advanced → check “Show features for web developers”
- Safari menu → Develop → {machine name} → localhost
- Use the Web Inspector console for debugging and verification
Smoke Test Scenarios
| # | Scenario | Validates | Automation |
|---|---|---|---|
| 1 | Open workspace → sidebar populates | Event propagation → React state | Partial — verify page count in DB |
| 2 | Create page → embedding status | EmbeddingManager signal | Partial — check embedding queue |
| 3 | Export pages → native dialog | Native dialog integration | Manual — visual confirmation |
INK-358 Sentinel Conversion — Complete
Status: Done (2026-02-24)
All 7 child issues resolved:
- INK-405: Foundation specs and HTTP bridge scaffolding ✓
- INK-406: Phase 1 tests (6 spec files) ✓
- INK-407: Phase 2 tests (4 spec files) ✓
- INK-408: Phase 3 tests (4 spec files) ✓
- INK-409: Phase 4 cleanup and finalization ✓
- INK-410: Sentinel-driven test maintenance ✓
- INK-411: Playwright agents and healing ✓
The final test suite contains 301 tests across 17 spec files. Of these, 271 are active and passing (90% of the total), 25 are marked fixme and tracked by category, and 5 are skipped because the corresponding UI is not implemented. The suite is production-ready for pre-release validation.
Pre-Release Gate
A quality gate that must pass before any release. Run as part of the release checklist.
Gate Process
    # 1. Run full test suite (Playwright handles bridge startup)
    pnpm test:e2e

    # 2. Review results
    # - All tests must PASS or be in a known fixme/skip state
    # - Zero unexpected failures
    # - Check: test-results/.last-run.json should show "status": "passed"

    # 3. Generate HTML report for review (optional)
    cd tests/e2e && npx playwright test --reporter=html
    # Open playwright-report/index.html

Gate Criteria
| Criterion | Threshold | Notes |
|---|---|---|
| Pass rate (runnable) | 100% | All non-fixme, non-skip tests must pass |
| No new failures | Zero | Compare against previous run |
| Fixme tests reviewed | Yes | Ensure no fixme tests should be re-enabled |
| Bridge health | All 4 UP | All partition bridge instances responding |
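
The runnable pass-rate criterion excludes fixme and skip tests from the denominator, which is why a suite can report a 90% overall pass rate and still meet the 100% threshold. The sketch below works through that arithmetic; `SuiteSummary`, `GateResult`, and `evaluateGate` are hypothetical names, not part of the project's tooling.

```typescript
interface SuiteSummary {
  total: number;
  passed: number;
  fixme: number;
  skipped: number;
}

interface GateResult {
  runnable: number;
  runnablePassRate: number; // passed / runnable
  overallPassRate: number; // passed / total
  gatePassed: boolean;
}

// Apply the "Pass rate (runnable) = 100%" criterion: fixme and skip tests are excluded.
function evaluateGate(s: SuiteSummary): GateResult {
  const runnable = s.total - s.fixme - s.skipped;
  return {
    runnable,
    runnablePassRate: runnable > 0 ? s.passed / runnable : 0,
    overallPassRate: s.total > 0 ? s.passed / s.total : 0,
    gatePassed: runnable > 0 && s.passed === runnable,
  };
}

// Figures from the suite inventory on this page.
const gate = evaluateGate({ total: 301, passed: 271, fixme: 25, skipped: 5 });
console.log(gate.runnable, gate.gatePassed, gate.overallPassRate.toFixed(2));
```

With the inventory figures, 301 - 25 - 5 = 271 runnable tests, all of which pass, so the runnable rate is 100% while the overall rate is 271/301, roughly 90%.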
Suite Inventory (INK-358 Complete)
| Metric | Count |
|---|---|
| Total tests | 301 |
| Active (passing) | 271 |
| Fixme (known limitation) | 25 |
| Skip (no UI built) | 5 |
| Test files | 17 spec files |
| Suite duration | ~2.5 min |
Fixme Breakdown (25 tests):
- Loro CRDT: 12 tests
- Import edge cases: 7 tests
- Multi-workspace lifecycle: 3 tests
- Cloud-only: 2 tests
- FTS snippet: 1 test
Skip Breakdown (5 tests):
- Page history sub-view unimplemented: 5 tests
Test File Distribution (17 spec files):
| Spec File | Active | Fixme | Skip | Notes |
|---|---|---|---|---|
| smoke-test.spec.ts | 8 | — | — | |
| navigation-command-palette.spec.ts | 14 | — | — | |
| workspace-trash.spec.ts | 13 | — | — | |
| navigation-first-launch.spec.ts | 12 | — | — | |
| navigation-sidebar.spec.ts | 11 | — | — | |
| advanced-status-displays.spec.ts | 31 | 2 | — | |
| workspace-operations.spec.ts | 38 | 3 | — | |
| navigation-first-launch-tour.spec.ts | 9 | — | — | serial |
| workspace-page-hierarchy.spec.ts | 20 | — | — | |
| advanced-import.spec.ts | 10 | 7 | — | serial |
| navigation-search.spec.ts | 14 | 1 | — | serial |
| editor-persistence.spec.ts | 12 | 2 | — | |
| editor-formatting.spec.ts | 20 | 1 | — | |
| editor-wiki-links.spec.ts | 16 | 7 | — | serial |
| workspace-page-crud.spec.ts | 25 | 1 | — | |
| editor-page-attributes.spec.ts | 17 | 2 | — | |
| TOTAL | 250 | 25 | 0 | |
When to Run
- Before every release build
- After significant UI or backend changes
- After dependency upgrades (Playwright, Tauri, Loro)
- Weekly as a regression check (recommended)
Failure Response
| Failure Type | Action |
|---|---|
| Locator/selector mismatch | Run Playwright Healer or fix manually |
| Timing/race condition | Increase timeout or add waitForSelector |
| App regression | File a Linear issue (Team: Inklings, Label: Type:Bug) |
| Bridge gap (new) | Mark as test.fixme() with explanation |
| Infrastructure failure | Check bridge health, restart infrastructure |
Audit Integration
Run /qa-sentinel:audit periodically to detect:
- Spec → Test drift: NL spec scenarios without matching tests
- Test → Spec drift: Orphaned tests without NL spec backing
- Behavioral drift: Test failures indicating app changes vs stale specs
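
The first two drift categories amount to set differences between scenario titles and test titles. The sketch below illustrates only that idea; it is not how the audit skill is implemented (this page does not describe its method), `detectDrift` and the sample titles are hypothetical, and behavioral drift cannot be detected this way because it requires running the tests.

```typescript
interface DriftReport {
  specsWithoutTests: string[]; // spec -> test drift
  testsWithoutSpecs: string[]; // test -> spec drift
}

// Normalise titles so differences in case or spacing are not reported as drift.
function normalise(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, " ");
}

function detectDrift(specScenarios: string[], testTitles: string[]): DriftReport {
  const specSet = new Set(specScenarios.map(normalise));
  const testSet = new Set(testTitles.map(normalise));
  return {
    specsWithoutTests: specScenarios.filter((s) => !testSet.has(normalise(s))),
    testsWithoutSpecs: testTitles.filter((t) => !specSet.has(normalise(t))),
  };
}

// Hypothetical titles for illustration.
const report = detectDrift(
  ["Move a page to trash", "Restore a page from trash"],
  ["move a page to trash", "Empty the trash"],
);
console.log(report);
```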
Infrastructure Scripts
Infrastructure startup is handled automatically by Playwright’s webServer config. precheck.sh is the canonical backend
quality gate.
| Script | Purpose |
|---|---|
| tools/precheck.sh | Backend quality gate (clippy, fmt, lint, test) |
| tools/qa/stop-infrastructure.sh | Emergency cleanup of orphaned bridge processes |
| tools/qa/health-check.sh | Verify bridge health (/health endpoint) |
| tools/qa/start-tauri-smoke.sh | Start real Tauri app for native smoke tests |
| tools/qa/bridge-query.sh | Test bridge endpoint (curl wrapper) |