
QA Testing System

Linear Epic: INK-358 (Sentinel Conversion)
Status: Playwright Test Agents with Sentinel wrapper
Depends On: HTTP Bridge (apps/http-bridge/), Vite dev server, Playwright


Overview

The QA testing system uses Playwright Test Agents wrapped by @qa-sentinel skills for browser-based testing of the full Inklings UI against the real Rust backend, without requiring a Tauri runtime.

[Figure: Test Pyramid]

A standalone HTTP bridge replaces Tauri’s IPC, and a browser shim (bridge-inject.js) intercepts the frontend’s invoke() calls, routing them to the bridge via HTTP.

Two-Layer Architecture

  • @qa-sentinel (skill layer): Manages the testing lifecycle — discovery, scaffolding, drift detection
  • Playwright Test Agents (execution layer): Test planning, code generation, execution, and self-healing

Key principle: AI is used at authoring and maintenance time only. Test execution is npx playwright test — deterministic, fast, no LLM costs.

Architecture

Playwright (headless browser)
├── Vite dev server (localhost:1420)
│ └── React frontend + bridge-inject.js shim
│ └── HTTP bridge (localhost:9990–9993)
└── Axum → BridgeState → Use Cases → SQLite

How It Works

  1. Vite serves the React frontend at localhost:1420 (no Tauri)
  2. bridge-inject.js is injected via Playwright’s addInitScript before any page scripts load
  3. The shim replaces window.__TAURI_INTERNALS__.invoke with HTTP fetch() calls to the bridge
  4. HTTP bridge receives commands at POST /invoke/{command}, executes them against real SQLite, returns JSON
  5. Playwright drives the UI via standard locators and assertions
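
Steps 3 and 4 can be sketched as a small function. This is an illustration, not the actual contents of bridge-inject.js: the names makeBridgeInvoke and FetchLike are hypothetical, and the error handling is an assumption. Only the POST /invoke/{command} route and the JSON exchange come from the description above.

```typescript
// Minimal fetch-like signature so the sketch has no runtime dependencies.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  text(): Promise<string>;
}>;

// Returns a stand-in for Tauri's invoke(): each command becomes
// POST {baseUrl}/invoke/{command} with the arguments as a JSON body.
function makeBridgeInvoke(baseUrl: string, fetchFn: FetchLike) {
  return async function invoke(
    command: string,
    args: Record<string, unknown> = {},
  ): Promise<unknown> {
    const response = await fetchFn(`${baseUrl}/invoke/${command}`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(args),
    });
    if (!response.ok) {
      // Assumed behavior: reject the promise, as a failing Tauri command would.
      const detail = await response.text();
      throw new Error(`${command} failed with HTTP ${response.status}: ${detail}`);
    }
    return response.json();
  };
}
```

In a browser, a shim along these lines would assign the returned function to window.__TAURI_INTERNALS__.invoke and pass the global fetch as fetchFn. Taking fetchFn as a parameter keeps the mapping testable without a running bridge.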

Layer Mapping

Layer | Component | Location
Browser shim | bridge-inject.js | apps/http-bridge/bridge-inject.js
HTTP server | Axum router + route handlers | apps/http-bridge/src/
Application state | BridgeState (mirrors Tauri AppState) | apps/http-bridge/src/state.rs
Route handlers | 83 commands across 7 modules | apps/http-bridge/src/routes/
NL test specs | Natural language scenario definitions | tests/e2e/specs/
Generated tests | Playwright .spec.ts files | tests/e2e/tests/
Sentinel config | Project + partition config | tests/e2e/sentinel.config.yaml

HTTP Bridge

Building and Starting

# Build
cargo build -p http-bridge
# Start with a workspace
cargo run -p http-bridge -- --port 9990 --workspace .data/workspaces/qa-pilot
# Verify
curl -s http://localhost:9990/health # → "ok"

CLI Flags

Flag | Default | Description
--port | 9990 | HTTP listen port
--workspace | .data/workspaces/default | Workspace directory (auto-initialized)

Command Tiers

Tier | Count | Description
Tier 1 (Full) | ~70 | Fully implemented with real backend logic
Tier 2 (Stub) | ~10 | Return sensible defaults (e.g., get_settings)
Tier 3 (501) | ~2 | Not applicable outside Tauri (get_sync_status, refresh_recent_workspaces_menu)

Serialization Convention

The bridge uses #[serde(rename_all = "camelCase")] on all Deserialize structs with multi-word fields, matching Tauri’s automatic camelCase ↔ snake_case translation. This is critical — the React frontend sends camelCase JSON.
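
To make the naming convention concrete, the sketch below converts snake_case field names, as they would be declared on the Rust side, into the camelCase keys expected on the wire. The helper names and the example fields (workspace_path, page_id, title) are hypothetical and not taken from the bridge's actual structs.

```typescript
// Converts one snake_case identifier to camelCase (page_id -> pageId).
function snakeToCamel(name: string): string {
  return name.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Renames the top-level keys of an object; values are left untouched.
function toWireFormat(fields: Record<string, unknown>): Record<string, unknown> {
  const wire: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    wire[snakeToCamel(key)] = fields[key];
  }
  return wire;
}

// Hypothetical argument names, written the way Rust struct fields are declared.
const rustSideFields = {
  workspace_path: ".data/workspaces/qa-pilot",
  page_id: "p1",
  title: "Notes",
};

// The JSON body the frontend would be expected to send for those fields.
console.log(JSON.stringify(toWireFormat(rustSideFields)));
```

A request body that used page_id instead of pageId would not match a struct annotated with rename_all = "camelCase", which is why a missing attribute tends to show up as a deserialization error rather than a logic bug.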

Running Tests

Prerequisites

  • Rust toolchain (for bridge binary)
  • Node.js + pnpm (for Vite dev server)
  • Playwright installed in tests/e2e/

Quick Start

pnpm test:e2e handles bridge startup automatically via Playwright webServer config.

# Run all tests (from project root)
pnpm test:e2e
# Or from tests/e2e/
cd tests/e2e && pnpm test

Partitioned Execution

Tests run across 4 isolated bridge instances for data isolation:

Partition | Bridge Port | Test Scope
workspace-pages | 9990 | workspace-*.spec.ts
editor-content | 9991 | editor-*.spec.ts
navigation-search | 9992 | navigation-*.spec.ts
advanced-features | 9993 | advanced-*.spec.ts

Configuration in tests/e2e/playwright.config.ts maps each project to its bridge port and test directory.
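
The mapping can be pictured as data. The sketch below is not the real playwright.config.ts: it uses plain objects so it runs without Playwright installed, and the bridgeBaseUrl option under use is an assumed custom fixture option. Partition names, ports, and file patterns are the ones listed in this section.

```typescript
interface Partition {
  name: string;
  port: number;
  testMatch: string;
}

// Partition names, bridge ports, and test scopes as listed in this section.
const partitions: Partition[] = [
  { name: "workspace-pages", port: 9990, testMatch: "workspace-*.spec.ts" },
  { name: "editor-content", port: 9991, testMatch: "editor-*.spec.ts" },
  { name: "navigation-search", port: 9992, testMatch: "navigation-*.spec.ts" },
  { name: "advanced-features", port: 9993, testMatch: "advanced-*.spec.ts" },
];

// Shape loosely modeled on one entry of Playwright's `projects` array.
function toProject(partition: Partition) {
  return {
    name: partition.name,
    testMatch: partition.testMatch,
    // Assumption: test fixtures read the bridge URL from a custom option.
    use: { bridgeBaseUrl: `http://localhost:${partition.port}` },
  };
}

const projects = partitions.map(toProject);
console.log(JSON.stringify(projects, null, 2));
```

Deriving the projects from one table-like array keeps each partition's port and file pattern in a single place, so adding a fifth partition would be a one-line change.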

Bridge Shim Injection

The shim is injected via Playwright’s addInitScript in the test fixtures. It must run before the React app loads:

// Prepend to bridge-inject.js contents:
window.__BRIDGE_BASE_URL = "http://localhost:9990";
// Then the full bridge-inject.js contents
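
One way to assemble that script is sketched below. The helper name buildInitScript is hypothetical, and shimSource is a stand-in for the file contents, which a real fixture would read from apps/http-bridge/bridge-inject.js.

```typescript
// Prepends the base-URL assignment to the shim source so the shim can read it.
function buildInitScript(bridgeBaseUrl: string, shimSource: string): string {
  // JSON.stringify produces a quoted, escaped JavaScript string literal.
  return `window.__BRIDGE_BASE_URL = ${JSON.stringify(bridgeBaseUrl)};\n${shimSource}`;
}

// Stand-in for the contents of bridge-inject.js.
const shimSource = "/* bridge-inject.js contents */";

const initScript = buildInitScript("http://localhost:9991", shimSource);
console.log(initScript);
```

In a Playwright fixture, the resulting string could be passed as page.addInitScript({ content: initScript }), with the port chosen per partition. Because init scripts run before the page's own scripts, the shim would be in place when the React app makes its first invoke() call.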

Supabase Stub Environment

The frontend requires SUPABASE_URL and SUPABASE_PUBLISHABLE_KEY at build time. Use stub values:

SUPABASE_URL="http://localhost:54321"
SUPABASE_PUBLISHABLE_KEY="stub-anon-key-for-qa"

The bridge itself falls back to a stub SupabaseConfig when env vars aren’t set and uses in-memory auth.

Sentinel Skills

Three skills manage the testing lifecycle:

Skill | Purpose
qa-sentinel:setup | Scaffold project: Playwright agents, NL spec dirs, config
qa-sentinel:discover | Critical path + persona analysis → NL test specs
qa-sentinel:audit | Drift detection (NL specs ↔ app ↔ tests) → corrections

NL Spec Format

Natural language specs in tests/e2e/specs/ are the durable contract between discovery and test generation. They use YAML frontmatter + markdown with numbered H3 scenarios. See any spec file for the canonical format.

Test Generation Pipeline

NL Specs → Playwright Planner → Test Plans → Playwright Generator → .spec.ts

Generated .spec.ts files are regenerable from specs. NL specs survive Playwright API changes and UI restructures.

Known Bridge Gaps

Gap | Reason | Impact
Tauri events (emit/listen) | Stubbed; no event bus in bridge | Event-driven UI features won’t fire
Sync lifecycle | No real sync engine in bridge | Sync status always “not connected”
Native dialogs | No Tauri window API | File browse dialogs don’t open
Window menu refresh | No native menu | Recent workspaces menu not updated
React strict mode | Double-invokes effects in dev | First initialize_workspace may hit “database is locked”; the second succeeds

Adding New Commands

When new Tauri commands are added to apps/desktop/src-tauri/src/commands/:

  1. Add the route handler in the appropriate apps/http-bridge/src/routes/*.rs module
  2. Register the route in apps/http-bridge/src/router.rs
  3. Add #[serde(rename_all = "camelCase")] to any Deserialize structs with multi-word fields
  4. Test via curl -s -X POST localhost:9990/invoke/{command} before browser testing

This manual sync is tracked as a maintenance concern (see INK-234 for macro-based generation).


Tauri-Native Smoke Tests

These scenarios run in the real Tauri app (not the HTTP bridge) to verify behaviors the bridge cannot replicate: event propagation, background workers, and native dialogs.

Prerequisites

  • Real Tauri app launched via tools/qa/start-tauri-smoke.sh
  • Safari Web Inspector connected to the Tauri webview

Connecting to the Tauri Webview

On macOS, Tauri uses WKWebView — Chrome DevTools Protocol (CDP) cannot connect to it. Use Safari Web Inspector:

  1. Open Safari
  2. Safari menu → Settings → Advanced → check “Show features for web developers”
  3. Safari menu → Develop → {machine name} → localhost
  4. Use the Web Inspector console for debugging and verification

Smoke Test Scenarios

# | Scenario | Validates | Automation
1 | Open workspace → sidebar populates | Event propagation → React state | Partial: verify page count in DB
2 | Create page → embedding status | EmbeddingManager signal | Partial: check embedding queue
3 | Export pages → native dialog | Native dialog integration | Manual: visual confirmation

INK-358 Sentinel Conversion — Complete

Status: Done (2026-02-24)

All 7 child issues resolved:

  • INK-405: Foundation specs and HTTP bridge scaffolding ✓
  • INK-406: Phase 1 tests (6 spec files) ✓
  • INK-407: Phase 2 tests (4 spec files) ✓
  • INK-408: Phase 3 tests (4 spec files) ✓
  • INK-409: Phase 4 cleanup and finalization ✓
  • INK-410: Sentinel-driven test maintenance ✓
  • INK-411: Playwright agents and healing ✓

The final test suite contains 301 tests across 17 spec files. Of these, 271 are active and passing, which is 90% of the total; the 100% threshold in the pre-release gate applies only to these runnable tests. The 25 fixme tests are tracked by category, and the 5 skipped tests correspond to unimplemented UI. The suite is production-ready for pre-release validation.


Pre-Release Gate

A quality gate that must pass before any release. Run as part of the release checklist.

Gate Process

# 1. Run full test suite (Playwright handles bridge startup)
pnpm test:e2e
# 2. Review results
# - All tests must PASS or be in a known fixme/skip state
# - Zero unexpected failures
# - Check: test-results/.last-run.json should show "status": "passed"
# 3. Generate HTML report for review (optional)
cd tests/e2e && npx playwright test --reporter=html
# Open playwright-report/index.html

Gate Criteria

Criterion | Threshold | Notes
Pass rate (runnable) | 100% | All non-fixme, non-skip tests must pass
No new failures | Zero | Compare against previous run
Fixme tests reviewed | Yes | Ensure no fixme tests should be re-enabled
Bridge health | All 4 UP | All partition bridge instances responding
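
The two mechanical criteria (runnable pass rate and bridge health) can be expressed as a small check. This is a sketch under assumptions: RunSummary and evaluateGate are hypothetical names, and the counts would have to be supplied from a test report. Comparing against the previous run and reviewing fixme tests remain manual steps.

```typescript
interface RunSummary {
  passed: number;
  failed: number;
  fixme: number;
  skipped: number;
  bridgesUp: boolean[]; // one entry per partition bridge
}

function evaluateGate(run: RunSummary): { pass: boolean; reasons: string[] } {
  const reasons: string[] = [];

  // Fixme and skipped tests are excluded from the runnable pass rate.
  const runnable = run.passed + run.failed;
  // Assumption: a run with no runnable tests does not satisfy the gate.
  const passRate = runnable === 0 ? 0 : run.passed / runnable;
  if (passRate < 1) {
    reasons.push(`runnable pass rate ${(passRate * 100).toFixed(1)}% is below 100%`);
  }

  if (run.bridgesUp.length !== 4 || run.bridgesUp.some((up) => !up)) {
    reasons.push("not all 4 partition bridges are responding");
  }

  return { pass: reasons.length === 0, reasons };
}

// Example using the suite inventory counts with all bridges healthy.
const gate = evaluateGate({
  passed: 271,
  failed: 0,
  fixme: 25,
  skipped: 5,
  bridgesUp: [true, true, true, true],
});
console.log(gate.pass); // true
```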

Suite Inventory (INK-358 Complete)

Metric | Count
Total tests | 301
Active (passing) | 271
Fixme (known limitation) | 25
Skip (no UI built) | 5
Test files | 17 spec files
Suite duration | ~2.5 min

Fixme Breakdown (25 tests):

  • Loro CRDT: 12 tests
  • Import edge cases: 7 tests
  • Multi-workspace lifecycle: 3 tests
  • Cloud-only: 2 tests
  • FTS snippet: 1 test

Skip Breakdown (5 tests):

  • Page history sub-view unimplemented: 5 tests
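
The inventory figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in this section; the variable and function names are illustrative.

```typescript
const inventory = { total: 301, active: 271, fixme: 25, skip: 5 };

const fixmeBreakdown: Record<string, number> = {
  "Loro CRDT": 12,
  "Import edge cases": 7,
  "Multi-workspace lifecycle": 3,
  "Cloud-only": 2,
  "FTS snippet": 1,
};

// Adds up all values of a category-to-count map.
function sumCounts(counts: Record<string, number>): number {
  return Object.keys(counts).reduce((acc, key) => acc + counts[key], 0);
}

// Active + fixme + skip should account for every test.
const categoriesAddUp =
  inventory.active + inventory.fixme + inventory.skip === inventory.total;
// The per-category fixme counts should match the fixme total.
const fixmeAddsUp = sumCounts(fixmeBreakdown) === inventory.fixme;
// Share of all tests that are active and passing, as a whole percentage.
const overallPassRate = Math.round((inventory.active / inventory.total) * 100);

console.log(categoriesAddUp, fixmeAddsUp, overallPassRate); // true true 90
```

The 90% figure is the share of all 301 tests that are active and passing. It differs from the gate's runnable pass rate, which excludes fixme and skipped tests.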

Test File Distribution (17 spec files):

Spec FileActiveFixmeSkipNotes
smoke-test.spec.ts8
navigation-command-palette.spec.ts14
workspace-trash.spec.ts13
navigation-first-launch.spec.ts12
navigation-sidebar.spec.ts11
advanced-status-displays.spec.ts312
workspace-operations.spec.ts383
navigation-first-launch-tour.spec.ts9serial
workspace-page-hierarchy.spec.ts20
advanced-import.spec.ts107serial
navigation-search.spec.ts141serial
editor-persistence.spec.ts122
editor-formatting.spec.ts201
editor-wiki-links.spec.ts167serial
workspace-page-crud.spec.ts251
editor-page-attributes.spec.ts172
TOTAL250250

When to Run

  • Before every release build
  • After significant UI or backend changes
  • After dependency upgrades (Playwright, Tauri, Loro)
  • Weekly as a regression check (recommended)

Failure Response

Failure Type | Action
Locator/selector mismatch | Run Playwright Healer or fix manually
Timing/race condition | Increase timeout or add waitForSelector
App regression | File a Linear issue (Team: Inklings, Label: Type:Bug)
Bridge gap (new) | Mark as test.fixme() with explanation
Infrastructure failure | Check bridge health, restart infrastructure

Audit Integration

Run /qa-sentinel:audit periodically to detect:

  • Spec → Test drift: NL spec scenarios without matching tests
  • Test → Spec drift: Orphaned tests without NL spec backing
  • Behavioral drift: Test failures indicating app changes vs stale specs
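
The first two drift types amount to set differences between spec scenarios and tests. The sketch below illustrates that idea only; it is not how the audit skill is implemented. It assumes scenarios and tests can be matched by exact title, and the titles in the example are made up.

```typescript
interface DriftReport {
  specWithoutTest: string[]; // Spec → Test drift
  testWithoutSpec: string[]; // Test → Spec drift
}

function findDrift(specScenarios: string[], testTitles: string[]): DriftReport {
  const specSet = new Set(specScenarios);
  const testSet = new Set(testTitles);
  return {
    specWithoutTest: specScenarios.filter((title) => !testSet.has(title)),
    testWithoutSpec: testTitles.filter((title) => !specSet.has(title)),
  };
}

// Hypothetical titles for illustration.
const drift = findDrift(
  ["Create a page", "Rename a page", "Move page to trash"],
  ["Create a page", "Move page to trash", "Restore page from trash"],
);
console.log(drift.specWithoutTest); // [ 'Rename a page' ]
console.log(drift.testWithoutSpec); // [ 'Restore page from trash' ]
```

Behavioral drift does not reduce to a set difference, since it requires deciding whether a failing test reflects an app change or a stale spec.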

Infrastructure Scripts

Infrastructure startup is handled automatically by Playwright’s webServer config. precheck.sh is the canonical backend quality gate.

Script | Purpose
tools/precheck.sh | Backend quality gate (clippy, fmt, lint, test)
tools/qa/stop-infrastructure.sh | Emergency cleanup of orphaned bridge processes
tools/qa/health-check.sh | Verify bridge health (/health endpoint)
tools/qa/start-tauri-smoke.sh | Start real Tauri app for native smoke tests
tools/qa/bridge-query.sh | Test bridge endpoint (curl wrapper)
