Sync System
Phase: Multi-Device Sync
Depends On: Workspace System, Page System
Overview
The Sync System enables multi-device synchronization using Supabase as the cloud backend. Content changes are pushed to and pulled from Supabase Postgres tables; Supabase Realtime WebSocket subscriptions wake the sync loop when remote changes arrive.
Sync operates at three distinct levels, each with its own merge strategy:
- Block CRDT merge — Loro document updates (push + pull, commutative merge)
- Page metadata LWW — Title, parent, page type, icon (Last-Write-Wins by timestamp)
- Page deletion tombstones — Soft-delete propagation with cascade and conflict detection
Sync is opt-in per workspace. A workspace must be registered in the cloud (cloud_id) before sync is enabled.
Architecture
Layer Boundaries
The sync system follows the project’s Clean Architecture pattern strictly — the application layer
(crates/application/src/sync/) defines all traits; infrastructure implements them.
    Application Layer (crates/application/src/sync/)
    ├── SyncEngine — State machine (push + pull cycle)
    ├── Use Cases — Individual operations, individually testable
    └── Service Traits — Contracts for infrastructure to implement
    Infrastructure (SQLite) (crates/infrastructure/sqlite/src/sync/)
    ├── SqliteBlockStorageRepository — CRDT snapshots + sync cursor
    ├── SqliteOfflineQueueRepository — Offline push queue + dead letters
    ├── SqliteMetadataStorageRepository — Page metadata sync state
    ├── SqliteDeletionStorageRepository — Tombstone tracking
    └── SqliteLocalWorkspaceSyncRepository — Cloud ID + enabled flag
    Infrastructure (Supabase) (crates/infrastructure/supabase/src/sync/)
    ├── SupabaseSyncRepository — Push/pull block updates
    ├── SupabaseMetadataSyncRepository — Push/pull page metadata
    ├── SupabaseDeletionSyncRepository — Push/pull tombstones
    └── SupabaseRealtimeClient — WebSocket subscription + wake signal

Database Location
Per-workspace SQLite database at {workspace_path}/.inklings/inklings.db.
Sync-related tables:
- blocks.content_loro — Loro CRDT snapshot (source of truth for block content)
- blocks.content — Materialized text (debug/search; derived from snapshot)
- pages.raw_markdown — Full-page text for FTS5 indexing (updated on pull)
- sync_state — Per-block version vectors + sync cursor (stored under the __sync_cursor__ key)
- sync_queue — Offline push queue
- sync_dead_letters — Queue entries that exceeded max retries
- page_metadata_sync — Per-field LWW tracking (changed_at + device_id)
- page_tombstones — Local tombstone log
Three Sync Levels
1. Block CRDT Merge
Block content is stored as Loro CRDT documents. Loro uses operation-based CRDTs — incremental update bytes can be applied in any order and produce the same result.
Push path: On every save, EnqueueBlockUpdateUseCase appends an incremental Loro update to the sync_queue table.
The sync engine’s push phase dequeues batches (up to push_batch_size = 50) and calls
SyncRepository::push_block_update to write them to Supabase’s block_updates table.
Pull path: The pull phase calls SyncRepository::pull_block_updates with since_cursor, retrieving new rows
ordered by their server-assigned row ID. For each remote block, the engine loads the local Loro snapshot, applies remote
updates via LoroMerger::apply_updates, extracts materialized text, and writes back via
BlockStorageRepository::save_block_snapshot. Both blocks.content_loro and pages.raw_markdown are updated
atomically.
Self-filter: Updates from the current device are skipped during merge (the device already has those changes locally). The cursor still advances past self-updates to avoid re-fetching them.
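As a minimal sketch, the per-block merge step can be written against a simplified version of the LoroMerger trait (the signatures here are assumptions; the real trait also exposes get_version_vector and empty_snapshot):

```rust
/// Simplified stand-in for the LoroMerger trait described below.
trait LoroMerger {
    fn apply_updates(&self, snapshot: &[u8], updates: &[Vec<u8>]) -> Vec<u8>;
    fn snapshot_to_text(&self, snapshot: &[u8]) -> String;
}

/// One row pulled from the remote block_updates table (illustrative fields).
struct RemoteBlockUpdate {
    update_id: i64,        // server-assigned row ID, used as the cursor
    block_id: String,
    update_bytes: Vec<u8>, // incremental Loro update
    device_id: String,
}

/// Merge one block's remote updates into its local snapshot, skipping
/// updates that originated on this device (self-filter).
fn merge_remote_block(
    merger: &dyn LoroMerger,
    local_snapshot: &[u8],
    remote: &[RemoteBlockUpdate],
    own_device_id: &str,
) -> (Vec<u8>, String) {
    let foreign: Vec<Vec<u8>> = remote
        .iter()
        .filter(|u| u.device_id != own_device_id) // self-filter
        .map(|u| u.update_bytes.clone())
        .collect();

    // CRDT updates commute, so apply order does not matter.
    let merged = merger.apply_updates(local_snapshot, &foreign);
    let text = merger.snapshot_to_text(&merged); // materialized text for blocks.content
    (merged, text)
}
```

The merged snapshot and materialized text are then written back together, as described above.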
Key types:
- RemoteBlockUpdate — Server row ID (cursor), block ID, Loro update bytes, device ID, sequence number
- QueuedUpdate — Queue row ID, block ID, update bytes, retry count, version vector
- LoroMerger trait — apply_updates, snapshot_to_text, get_version_vector, empty_snapshot
2. Page Metadata LWW
Page metadata fields (title, parent_slug, page_type, icon, icon_color, template, sort_order) are synchronized using Last-Write-Wins conflict resolution. Each field is tracked independently.
Push path: PushPageMetadataUseCase reads pending fields from the MetadataSyncQueueRepository and calls
MetadataSyncRepository::push_page_metadata with the field name, value, changed_at timestamp, and device_id.
Pull path: PullPageMetadataUseCase fetches remote changes since an ISO-8601 cursor timestamp. For each field, it
compares the remote changed_at with the local value’s timestamp:
- If no local value exists: apply remote
- If remote changed_at is strictly later: apply remote
- If timestamps are equal: higher device_id (lexicographic) wins
- Otherwise: keep local
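Expressed as a sketch (assuming both timestamps are ISO-8601 strings in the same format, so string comparison matches chronological order):

```rust
/// Outcome of comparing a remote metadata field against the local copy.
enum LwwDecision {
    ApplyRemote,
    KeepLocal,
}

/// Last-Write-Wins comparison for a single page-metadata field.
/// `local` is None when the field has never been written locally.
fn lww_decide(
    local: Option<(&str, &str)>, // (changed_at, device_id)
    remote_changed_at: &str,
    remote_device_id: &str,
) -> LwwDecision {
    match local {
        None => LwwDecision::ApplyRemote,
        Some((local_changed_at, local_device_id)) => {
            if remote_changed_at > local_changed_at {
                LwwDecision::ApplyRemote // remote is strictly later
            } else if remote_changed_at == local_changed_at && remote_device_id > local_device_id {
                LwwDecision::ApplyRemote // tie broken by higher device_id
            } else {
                LwwDecision::KeepLocal
            }
        }
    }
}
```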
Critical invariant: The remote changed_at and device_id must be passed through to storage unchanged. Never
substitute the local apply time — doing so corrupts the LWW causal ordering for all downstream devices.
An allowlist in SqliteMetadataStorageRepository maps known field names to pages table columns (see the sketch below). Unknown fields are stored in page_metadata_sync only and not applied to the page.
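A hypothetical shape of that allowlist, assuming the pages column names match the synced field names one-to-one:

```rust
/// Map a synced metadata field to its pages-table column, if any.
/// Unknown fields return None and are recorded in page_metadata_sync only.
fn page_column_for_field(field: &str) -> Option<&'static str> {
    match field {
        "title" => Some("title"),
        "parent_slug" => Some("parent_slug"),
        "page_type" => Some("page_type"),
        "icon" => Some("icon"),
        "icon_color" => Some("icon_color"),
        "template" => Some("template"),
        "sort_order" => Some("sort_order"),
        _ => None, // unknown fields are never interpolated into SQL or applied
    }
}
```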
Key types:
- RemotePageMetadata — page ID, field name, value (nullable), changed_at, device_id
- QueuedMetadataField — page ID, field, value, changed_at
3. Page Deletion Tombstones
Page deletions are propagated as tombstones — a record that a page was deleted, keyed by page ID, timestamped, and attributed to the originating device.
Push path: When a page is deleted locally, PushPageDeletionUseCase pushes a tombstone to
DeletionSyncRepository::push_page_tombstone.
Pull path: PullPageDeletionsUseCase fetches tombstones since a cursor timestamp. For each tombstone:
- Check if the page exists locally
- If not: record the tombstone but skip deletion
- If yes: detect conflicts (page has local edits since tombstone timestamp), then apply soft-delete regardless (“delete wins”)
- Record the tombstone locally for audit
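A sketch of that decision, using illustrative types rather than the real use-case API (conflict detection assumes comparable ISO-8601 timestamps):

```rust
/// Possible outcomes when applying one remote tombstone.
enum TombstoneOutcome {
    PageAbsent,          // tombstone recorded, nothing to delete
    Deleted,             // cascade soft-delete applied cleanly
    DeletedWithConflict, // local edits were newer, but delete still wins
}

fn apply_remote_tombstone(
    page_exists: bool,
    last_local_edit_at: Option<&str>, // ISO-8601, None if never edited
    tombstone_deleted_at: &str,
) -> TombstoneOutcome {
    if !page_exists {
        return TombstoneOutcome::PageAbsent;
    }
    let has_conflict =
        matches!(last_local_edit_at, Some(edit) if edit > tombstone_deleted_at);
    if has_conflict {
        // Logged as a warning; the soft-delete is applied anyway ("delete wins").
        TombstoneOutcome::DeletedWithConflict
    } else {
        TombstoneOutcome::Deleted
    }
}
```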
Cascade semantics: Remote deletions apply the same cascade as local deletes — all descendants are soft-deleted
atomically via a recursive CTE, and their pending sync_queue entries are removed to prevent re-pushing updates for
deleted pages.
    WITH RECURSIVE descendants(slug) AS (
        SELECT slug FROM pages WHERE slug = ?1 AND is_deleted = 0
        UNION ALL
        SELECT p.slug FROM pages p
        INNER JOIN descendants d ON p.parent_slug = d.slug
        WHERE p.is_deleted = 0
    )
    UPDATE pages
    SET is_deleted = 1, deleted_at = strftime('%Y-%m-%dT%H:%M:%f', 'now')
    WHERE slug IN (SELECT slug FROM descendants)

Key types:
- RemotePageTombstone — page ID, page title, deleted_by_device_id, deleted_at
- QueuedPageDeletion — page ID, page title
Sync Engine State Machine
SyncEngine (crates/application/src/sync/sync_engine.rs) is the central orchestrator. It is a pure application-layer
component with no framework dependencies; the framework layer (SyncManager in Tauri) drives it in a tokio background
task.
Status
    pub enum SyncStatus {
        Idle,     // Waiting for next sync interval
        Syncing,  // Cycle in progress
        Offline,  // Remote unreachable; retrying with backoff
        Disabled, // No cloud identity configured
    }

Configuration
Default values (from SyncConfig::default()):
| Parameter | Default | Description |
|---|---|---|
| sync_interval | 5s | Time between cycles when online |
| push_batch_size | 50 | Max queue entries pushed per cycle |
| pull_batch_size | 100 | Max remote updates pulled per cycle |
| initial_backoff | 1s | First retry delay when offline |
| max_backoff | 60s | Ceiling for exponential backoff |
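Roughly, these defaults correspond to a config struct like the following (a sketch; the real SyncConfig field names and types may differ):

```rust
use std::time::Duration;

#[derive(Clone, Debug)]
pub struct SyncConfig {
    pub sync_interval: Duration,   // time between cycles when online
    pub push_batch_size: usize,    // max queue entries pushed per cycle
    pub pull_batch_size: usize,    // max remote updates pulled per cycle
    pub initial_backoff: Duration, // first retry delay when offline
    pub max_backoff: Duration,     // ceiling for exponential backoff
}

impl Default for SyncConfig {
    fn default() -> Self {
        Self {
            sync_interval: Duration::from_secs(5),
            push_batch_size: 50,
            pull_batch_size: 100,
            initial_backoff: Duration::from_secs(1),
            max_backoff: Duration::from_secs(60),
        }
    }
}
```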
Cycle Phases
Each call to sync_cycle() runs six phases in order:
Push phases (local → cloud):

1. Push block updates — drain sync_queue to Supabase block_updates
2. Push page metadata — flush pending field changes
3. Push page deletions — push tombstones

Pull phases (cloud → local):

4. Pull block updates — fetch new rows, CRDT merge, update cursor
5. Pull page metadata — fetch field changes, LWW merge, advance cursor
6. Pull page deletions — fetch tombstones, cascade soft-delete, advance cursor
Offline Detection and Backoff
After each cycle, the engine evaluates all push results:

- If there were push attempts and every single one failed: status → Offline, consecutive_failures += 1
- If at least one operation succeeded: status → Idle, consecutive_failures = 0
When offline, next_delay() returns an exponentially increasing backoff duration:
    backoff = initial_backoff * 2^min(consecutive_failures, 6)

Capped at max_backoff. When mark_online() is called (e.g. after a manual force-sync), the counter resets immediately.
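A sketch of next_delay() following that formula:

```rust
use std::time::Duration;

/// Exponential backoff: exponent clamped at 6, result capped at max_backoff.
fn next_delay(
    initial_backoff: Duration,
    max_backoff: Duration,
    consecutive_failures: u32,
) -> Duration {
    let factor = 2u32.saturating_pow(consecutive_failures.min(6));
    initial_backoff.saturating_mul(factor).min(max_backoff)
}
```

With the defaults (1s initial, 60s cap), the delay reaches the ceiling within a handful of failed cycles.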
Phase Failure Semantics
Each phase returns Err only when all operations within it fail (not partial failure). If only some items fail, the
phase returns Ok with separate pushed_count and push_failed_count fields. This means a single bad item doesn’t
abort the cycle.
Failed queue entries have their retry_count incremented via SyncQueueRepository::mark_failed. Entries exceeding the
max retry cap are moved to the dead letter table and no longer block the queue.
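A sketch of those semantics for a push phase, with closures standing in for the real repository calls; MAX_RETRIES is an assumed constant rather than the actual cap:

```rust
struct QueuedUpdate {
    id: i64,
    retry_count: u32,
}

struct PhaseResult {
    pushed_count: usize,
    push_failed_count: usize,
}

const MAX_RETRIES: u32 = 5; // assumed cap for illustration

fn push_block_phase(
    batch: Vec<QueuedUpdate>,
    push: impl Fn(&QueuedUpdate) -> Result<(), String>,
    mut mark_synced: impl FnMut(i64),
    mut mark_failed: impl FnMut(i64),          // increments retry_count
    mut move_to_dead_letters: impl FnMut(i64),
) -> Result<PhaseResult, String> {
    let total = batch.len();
    let mut pushed_count = 0;

    for entry in &batch {
        match push(entry) {
            Ok(()) => {
                mark_synced(entry.id); // removed from sync_queue
                pushed_count += 1;
            }
            Err(_) if entry.retry_count + 1 >= MAX_RETRIES => {
                move_to_dead_letters(entry.id); // stops blocking the queue
            }
            Err(_) => mark_failed(entry.id),
        }
    }

    if total > 0 && pushed_count == 0 {
        return Err("all pushes in this phase failed".into()); // total failure only
    }
    Ok(PhaseResult {
        pushed_count,
        push_failed_count: total - pushed_count,
    })
}
```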
Cursor Safety
The cursor represents “successfully processed up to this point”, not “fetched”.
During the block pull phase, updates for different blocks may arrive interleaved in server ID order (e.g. id=1 block-A, id=2 block-B, id=3 block-A). If block-A fails to merge, advancing the cursor past all three would permanently skip id=1 and id=3.
The engine tracks this with two accumulators:
- max_update_id — highest ID seen across all blocks (for the success case)
- min_failed_id — lowest ID from any failed block
Safe cursor = min(failed_ids) - 1 when any block fails; otherwise max_update_id. The cursor never decreases below
its previous value.
Self-updates (same device_id) are filtered from the merge step but still advance the cursor — the device already has
these changes locally, so skipping the merge is safe, but the cursor must move past them to avoid re-fetching.
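The accumulator logic can be sketched as follows (block grouping and the merge itself are omitted):

```rust
/// Track the highest update ID seen and the lowest ID from any block that
/// failed to merge, then compute the safe cursor.
#[derive(Default)]
struct CursorTracker {
    max_update_id: i64,
    min_failed_id: Option<i64>,
}

impl CursorTracker {
    fn record(&mut self, update_id: i64, merge_ok: bool) {
        self.max_update_id = self.max_update_id.max(update_id);
        if !merge_ok {
            self.min_failed_id = Some(match self.min_failed_id {
                Some(id) => id.min(update_id),
                None => update_id,
            });
        }
    }

    /// New cursor value, given the previously stored cursor.
    fn safe_cursor(&self, previous: i64) -> i64 {
        let candidate = match self.min_failed_id {
            Some(failed) => failed - 1, // stop just before the first failed update
            None => self.max_update_id, // everything merged; take the highest ID
        };
        candidate.max(previous) // never move the cursor backwards
    }
}
```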
LWW Patterns
Timestamp Pass-Through
In any LWW system, the changed_at timestamp represents when the original edit occurred, not when this device learned
about it. Any layer that regenerates the timestamp breaks causal ordering for all downstream devices.
The apply_metadata_update trait method signature explicitly accepts changed_at: &str and device_id: &str as
parameters that must flow through to storage unchanged:
    fn apply_metadata_update(
        &self,
        workspace_path: &Path,
        page_id: &str,
        field: &str,
        value: Option<&str>,
        changed_at: &str, // Remote timestamp — preserve unchanged
        device_id: &str,  // Remote device — preserve unchanged
    ) -> SyncResult<()>;

The SQLite implementation stores ?4 and ?5 directly rather than substituting strftime('now').
Tie-Breaking
When two devices write the same field at the same timestamp, the tie is broken by device_id — the lexicographically
higher device ID wins. This is deterministic and produces the same result on all devices without coordination.
Realtime Subscriptions
SupabaseRealtimeClient implements RealtimeSubscriptionProvider. It subscribes to postgres_changes events on three
Supabase tables: block_updates, page_metadata, and page_tombstones.
The application layer sees only the RealtimeSubscriptionProvider trait:
    pub trait RealtimeSubscriptionProvider: Send + Sync {
        fn start(&self, workspace_cloud_id: String, wake: Arc<Notify>) -> SyncResult<()>;
        fn stop(&self) -> SyncResult<()>;
        fn is_connected(&self) -> bool;
    }

When an event arrives, the client fires wake.notify_one() after a 500ms debounce window. This wakes the sync loop for
an immediate pull cycle rather than waiting for the next scheduled interval.
The WebSocket connection uses a reconnect backoff (separate from the sync engine’s offline backoff). Heartbeats are sent every 25 seconds to keep the connection alive.
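A rough sketch of the wake path, assuming a tokio runtime and an internal channel of raw change events; the exact debounce mechanics inside SupabaseRealtimeClient may differ:

```rust
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::{mpsc, Notify};

/// Collapse a burst of realtime change events into a single wake signal
/// after a 500ms window (simplified debounce).
async fn debounce_and_wake(mut events: mpsc::Receiver<()>, wake: Arc<Notify>) {
    while events.recv().await.is_some() {
        tokio::time::sleep(Duration::from_millis(500)).await;
        while events.try_recv().is_ok() {} // drain events that arrived during the window
        wake.notify_one();
    }
}

/// Sync-loop side: run a cycle when either the interval elapses or a wake
/// signal arrives, whichever happens first.
async fn sync_loop(wake: Arc<Notify>, interval: Duration) {
    loop {
        tokio::select! {
            _ = tokio::time::sleep(interval) => {}
            _ = wake.notified() => {}
        }
        // sync_cycle() would run here
    }
}
```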
Use Cases
| Use Case | Purpose |
|---|---|
| EnableWorkspaceSyncUseCase | Register workspace in cloud + store cloud ID locally |
| DisableWorkspaceSyncUseCase | Mark sync disabled (preserves cloud ID for re-enable) |
| GetWorkspaceSyncStatusUseCase | Read current sync status + cloud ID |
| ListCloudWorkspacesUseCase | List all workspaces registered to the current user |
| EnqueueBlockUpdateUseCase | Add Loro update bytes to the offline push queue |
| PushBlockUpdatesUseCase | Push queued updates to Supabase |
| PullBlockUpdatesUseCase | Fetch remote updates + CRDT merge |
| PushPageMetadataUseCase | Push pending metadata field changes |
| PullPageMetadataUseCase | Fetch remote metadata changes + LWW merge |
| PushPageDeletionUseCase | Push page tombstone to cloud |
| PullPageDeletionsUseCase | Fetch tombstones + cascade soft-delete |
Key Code Paths
On Block Save (local → cloud)
- Editor saves Loro snapshot → SaveBlockContentUseCase (crates/application/src/page/)
- EnqueueBlockUpdateUseCase appends incremental update to sync_queue
- Next sync cycle: push_block_phase dequeues batch, calls SyncRepository::push_block_update
- On success: mark_synced removes from queue
- On failure: mark_failed increments retry_count; at max retries, moved to dead letters
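The enqueue step could look roughly like this, against a simplified queue-repository trait (the real signatures live in crates/application/src/sync/):

```rust
/// Simplified stand-in for the offline queue repository.
trait SyncQueueRepository {
    fn enqueue(&self, block_id: &str, update_bytes: &[u8], version_vector: &[u8])
        -> Result<(), String>;
}

struct EnqueueBlockUpdateUseCase<R: SyncQueueRepository> {
    queue: R,
}

impl<R: SyncQueueRepository> EnqueueBlockUpdateUseCase<R> {
    /// Called after every block save with the incremental Loro update bytes.
    fn execute(&self, block_id: &str, update: &[u8], version_vector: &[u8]) -> Result<(), String> {
        // Queued locally first; the push phase drains this on the next cycle,
        // so saves keep working while offline.
        self.queue.enqueue(block_id, update, version_vector)
    }
}
```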
On Remote Block Change (cloud → local)
- Supabase Realtime fires INSERT event on block_updates
- SupabaseRealtimeClient debounces 500ms, calls wake.notify_one()
- Sync loop wakes, runs pull_phase
- Reads cursor from sync_state.__sync_cursor__
- Fetches rows with id > cursor, grouped by block_id
- Filters self-updates (device_id match), loads local Loro snapshot
- Applies remote update bytes via LoroMerger::apply_updates
- Saves merged snapshot + materialized text via save_block_snapshot
- Updates sync_state version vectors
- Advances cursor to min(failed_ids) - 1 or max_update_id
On Page Metadata Change
- User renames page → UpdatePageUseCase writes to pages.title
- Change enqueued to MetadataSyncQueueRepository with changed_at timestamp
- Next cycle: push_metadata_phase reads pending fields, pushes to Supabase
- On pull: pull_metadata_phase fetches remote changes, runs LWW comparison
- If remote wins: apply_metadata_update with original remote timestamps
On Page Deletion (cascade to remote)
- User deletes page → DeletePageUseCase soft-deletes locally (cascade to children)
- PushPageDeletionUseCase enqueues tombstone
- Next cycle: push_deletions_phase pushes tombstone to Supabase
- Other devices: pull_deletions_phase fetches tombstone, cascade soft-deletes locally
Error Handling
| Scenario | Behavior |
|---|---|
| Network unavailable | consecutive_failures increments; exponential backoff up to 60s |
| Block merge failure | Cursor clamped below failed block; retry on next cycle |
| Queue entry exceeds retry cap | Moved to sync_dead_letters; does not block other entries |
| All pushes fail in a phase | Phase returns Err; engine transitions to Offline |
| LWW conflict | Resolved deterministically; no user intervention needed |
| Deletion conflict (local edits) | Logged as warning; deletion still applied (“delete wins”) |
References
- ADR-007: Agent Integration via MCP Server and Sync Protocol
- Solution: Sync Engine Cursor Safety
- Solution: LWW Metadata Sync Patterns
- Solution: CRDT Binary Pass-Through Pipeline
- Source: crates/application/src/sync/sync_engine.rs — SyncEngine state machine
- Source: crates/application/src/sync/services.rs — All service trait definitions
- Source: crates/infrastructure/sqlite/src/sync/ — SQLite implementations
- Source: crates/infrastructure/supabase/src/ — Supabase push/pull/realtime
Depends on Workspace System and Page System. Used by MCP System (remote agent sync peers).