
Model Downloader System

Status: Implemented
File: apps/desktop/src-tauri/src/model_downloader.rs


The model downloader automatically fetches the snowflake-arctic-embed-m-v2.0 ONNX embedding model (~585 MB) from HuggingFace on first workspace open. This is a first-run UX concern: the model is required for semantic search but too large to bundle with the application binary.

The download is gated by the semantic_search_enabled setting (default true). When disabled, the search system falls back to FTS5-only search with no download attempt.

| Property | Value |
| --- | --- |
| Model | snowflake-arctic-embed-m-v2.0 |
| Format | ONNX (FP16) |
| Size | ~585 MB (model.onnx) + ~2 MB (tokenizer.json) |
| Embedding dim | 768 |
| Token context | 512 tokens |
| Files | model.onnx, tokenizer.json |
| Model SHA-256 | f27ab40ab6e230265ba49a202a37f1ad031556256cbbc105d0ca9c0bdc7ec42e |
| Tokenizer SHA-256 | f1cc44ad7faaeec47241864835473fd5403f2da94673f3f764a77ebcb0a803ec |
| Model URL | https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0/resolve/main/onnx/model_fp16.onnx |
| Tokenizer URL | https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0/resolve/main/tokenizer.json |

The download is initiated in two scenarios:

  1. Automatic (workspace open): When the app opens a workspace, the embedding pipeline initialization checks for model files. If missing and semantic_search_enabled is true, the download begins in the background.
  2. Manual (settings toggle): When the user enables semantic search via the SemanticSearchToggle component, the frontend calls triggerModelDownload. The backend no-ops if the model is already present.
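Both scenarios reduce to the same three-way decision. The sketch below uses hypothetical names (`decide`, `DownloadDecision`); the real checks live in the embedding pipeline initialization and the Tauri command handler rather than in one function.

```rust
/// Hypothetical sketch of the download gating logic; illustrative only.
#[derive(Debug, PartialEq)]
enum DownloadDecision {
    Skip,     // semantic_search_enabled is false: FTS5-only, no download
    NoOp,     // model files already on disk
    Download, // start the background download
}

fn decide(semantic_search_enabled: bool, model_files_present: bool) -> DownloadDecision {
    if !semantic_search_enabled {
        DownloadDecision::Skip
    } else if model_files_present {
        DownloadDecision::NoOp
    } else {
        DownloadDecision::Download
    }
}
```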

The semantic_search_enabled field in Settings (schema v11) controls whether semantic search is active:

  • Default: true (opt-out design)
  • When true: Model is downloaded if missing; SearchRouter uses both FTS5 and semantic search with RRF merge
  • When false: No download attempt; search falls back to FTS5-only

Users toggle this via the SemanticSearchToggle in workspace settings, which displays the model size (~585 MB) as informed consent before download.

The tokenizer is downloaded first because it is small (~2 MB), giving the user immediate visual feedback that the download has started before the large model transfer begins.

Download progress is emitted on the embedding-model:download-progress Tauri event channel. Events are throttled to at most once per 256 KB to avoid flooding the event bus.

| Field | Type | Description |
| --- | --- | --- |
| status | string | "downloading", "verifying", "complete", "error", or "cancelled" |
| file | string | Current file: "model.onnx" or "tokenizer.json" |
| bytes_downloaded | u64 | Bytes downloaded so far for the current file |
| total_bytes | u64 | Total bytes from Content-Length (0 if unknown) |
| percent | f64 | Percentage complete (0.0–100.0) |
| bytes_per_second | f64 | Current download speed in bytes per second |
| eta_seconds | f64 | Estimated seconds remaining |
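The 256 KB throttle and the derived fields above can be sketched with plain arithmetic. `ProgressEmitter` and `derive_fields` are illustrative names, not the actual API; the real emitter also attaches `status` and `file` before sending the event.

```rust
/// Emit at most one progress event per 256 KB of new data.
const EMIT_EVERY_BYTES: u64 = 256 * 1024;

struct ProgressEmitter {
    last_emit_at: u64, // bytes_downloaded at the previous emit
}

impl ProgressEmitter {
    fn should_emit(&mut self, bytes_downloaded: u64) -> bool {
        if bytes_downloaded - self.last_emit_at >= EMIT_EVERY_BYTES {
            self.last_emit_at = bytes_downloaded;
            true
        } else {
            false
        }
    }
}

/// Returns (percent, bytes_per_second, eta_seconds).
fn derive_fields(bytes_downloaded: u64, total_bytes: u64, elapsed_secs: f64) -> (f64, f64, f64) {
    let percent = if total_bytes > 0 {
        bytes_downloaded as f64 / total_bytes as f64 * 100.0
    } else {
        0.0 // Content-Length unknown
    };
    let speed = if elapsed_secs > 0.0 { bytes_downloaded as f64 / elapsed_secs } else { 0.0 };
    let eta = if speed > 0.0 && total_bytes > bytes_downloaded {
        (total_bytes - bytes_downloaded) as f64 / speed
    } else {
        0.0
    };
    (percent, speed, eta)
}
```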

The download accepts a CancellationToken (from tokio_util). Cancellation is checked inside the download loop via biased tokio::select! — the cancellation branch has priority over new data chunks.

On cancellation:

  1. A status: "cancelled" event is emitted
  2. The .tmp file is left on disk (not cleaned up by the downloader itself)
  3. ModelDownloadError::Cancelled is returned
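The real loop is async and uses `tokio::select!` with the `biased;` keyword so the token is polled before the stream. As a dependency-free illustration of the contract (a Cancelled error is returned, the .tmp file is left alone), here is a synchronous analogue in which an `AtomicBool` stands in for the `CancellationToken`; the error enum is cut down to the one variant involved.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq)]
enum ModelDownloadError {
    Cancelled, // simplified: the real enum has more variants
}

/// Synchronous analogue of the cancellation check inside the download
/// loop. The async original is shaped like:
///   tokio::select! { biased;
///       _ = token.cancelled() => return Err(Cancelled),
///       chunk = stream.next() => { /* write chunk */ } }
fn consume_chunks(
    chunks: impl IntoIterator<Item = Vec<u8>>,
    cancel: &AtomicBool,
) -> Result<u64, ModelDownloadError> {
    let mut written = 0u64;
    for chunk in chunks {
        // Checked before each chunk, mirroring the biased branch priority.
        if cancel.load(Ordering::SeqCst) {
            // Real code emits a status "cancelled" event here; the .tmp
            // file stays on disk for the next attempt to overwrite.
            return Err(ModelDownloadError::Cancelled);
        }
        written += chunk.len() as u64;
    }
    Ok(written)
}
```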

The frontend exposes cancellation via the persistent toast’s cancel button, which calls cancelModelDownload.

SHA-256 verification is streaming — the hash is computed incrementally as bytes are written to the .tmp file. There is no separate verification pass after download completes. This means:

  1. Each chunk is written to disk AND fed to the Sha256 hasher
  2. After the stream completes, hasher.finalize() produces the digest
  3. The digest is compared against the embedded constant
  4. On match: .tmp is atomically renamed to the final filename
  5. On mismatch: .tmp is deleted, ChecksumMismatch error is returned
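Step 1 can be sketched as an `io::Write` adapter that tees every chunk into both sinks. This is an illustrative shape, not the actual implementation; the test uses `Vec<u8>` stand-ins for the file and the hasher (sha2's `Sha256` also implements `io::Write` with its default std feature, so the same adapter pattern applies there).

```rust
use std::io::{self, Write};

/// Write adapter that feeds every chunk to the destination and a hasher,
/// so the digest is ready the moment the stream ends (no re-read pass).
struct TeeWriter<W: Write, H: Write> {
    dest: W,
    hasher: H,
}

impl<W: Write, H: Write> Write for TeeWriter<W, H> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.dest.write(buf)?;
        // Hash exactly the bytes that were persisted.
        self.hasher.write_all(&buf[..n])?;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.dest.flush()
    }
}
```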

If a file already exists on disk, its SHA-256 is verified before skipping. If the existing file’s checksum does not match, it is deleted and re-downloaded.

The ModelDownloadError enum covers all failure modes:

| Variant | Trigger |
| --- | --- |
| Http | Network failure, non-2xx response, stream read error |
| Io | Filesystem errors (create dir, write file, rename) |
| ChecksumMismatch | SHA-256 of downloaded file does not match expected |
| Cancelled | CancellationToken was triggered during download |
| OversizedResponse | Content-Length exceeds 800 MB safety limit |

All error variants emit a status: "error" event before returning, so the frontend can clean up the progress toast and show an error message.

Model files are stored in {storage_dir}/models/, resolved by AppPaths::embedding_model_dir():

| Environment | Model Directory |
| --- | --- |
| Dev | {project}/.data/models/ |
| Production | ~/Library/Application Support/Inklings/models/ (macOS) |

The directory is created on first download via create_dir_all. Two files are expected:

```
models/
├── model.onnx       # ~585 MB, ONNX FP16 embedding model
└── tokenizer.json   # ~2 MB, tokenizer vocabulary
```

model_files_present() checks for both files by existence only (not checksum). The download flow performs full SHA-256 verification separately.
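A minimal sketch of that existence-only check; the function name matches the one above, but the body is an assumption about its shape.

```rust
use std::path::Path;

/// Presence check: existence only, no checksum. Full SHA-256
/// verification happens separately in the download flow.
fn model_files_present(model_dir: &Path) -> bool {
    model_dir.join("model.onnx").is_file() && model_dir.join("tokenizer.json").is_file()
}
```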

The useModelDownload hook lives at apps/desktop/src-react/hooks/useModelDownload.ts and is mounted in App.tsx for the lifetime of the application. It:

  • Listens to embedding-model:download-progress events
  • Manages a persistent toast with progress bar, speed, and ETA
  • Provides a cancel button that calls cancelModelDownload
  • On completion, shows a transient “Semantic search ready” toast
  • On embedding-model:ready event, calls reloadEmbeddingPipeline to hot-swap the search pipeline

The SemanticSearchToggle component lives at apps/desktop/src-react/components/settings/SemanticSearchToggle.tsx and is rendered in workspace settings. It:

  • Reads initial state from getSettings
  • Toggle ON: calls setSemanticSearchEnabled(true) then triggerModelDownload
  • Toggle OFF: calls setSemanticSearchEnabled(false) — no further action
  • Shows model size (~585 MB) in the description as informed consent

Downloads are all-or-nothing: if interrupted, the .tmp file is abandoned and the download restarts from scratch on the next trigger. Resume support (HTTP Range headers) was deliberately omitted; it would add complexity for a one-time download that typically completes in minutes.

The tokenizer (~2 MB) downloads first, giving immediate visual feedback in the progress toast before the long model download begins. This avoids a perceived hang at the start.

A hardcoded 800 MB maximum (MAX_MODEL_SIZE_BYTES) rejects any response larger than expected. This guards against serving a corrupted or unexpected file from the CDN. The current model is ~585 MB, leaving headroom for minor size changes across model versions.
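A sketch of that guard, assuming the check runs against the response's Content-Length before any bytes are streamed; `SizeCheck` is an illustrative stand-in for returning ModelDownloadError::OversizedResponse.

```rust
const MAX_MODEL_SIZE_BYTES: u64 = 800 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum SizeCheck {
    Ok,
    Oversized, // would map to ModelDownloadError::OversizedResponse
}

/// Reject a response up front when Content-Length exceeds the cap.
/// A missing Content-Length (None) passes this pre-check; the table
/// above notes total_bytes is then reported as 0.
fn check_content_length(content_length: Option<u64>) -> SizeCheck {
    match content_length {
        Some(len) if len > MAX_MODEL_SIZE_BYTES => SizeCheck::Oversized,
        _ => SizeCheck::Ok,
    }
}
```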

Computing the hash during the download (rather than in a separate pass) avoids a second full read of the ~585 MB model file, roughly halving the I/O. The hash is finalized in memory immediately after the stream completes.

tools/dev/download-embedding-model.sh pre-downloads the model for CI and development environments. This avoids the auto-download path when running tests or building in environments without interactive UX.

| File | Role |
| --- | --- |
| apps/desktop/src-tauri/src/model_downloader.rs | Download logic, SHA-256 verification, progress events |
| apps/desktop/src-tauri/src/embedding.rs | EmbeddingTask background worker that uses the downloaded model |
| apps/desktop/src-tauri/src/paths.rs | embedding_model_dir() path resolution (dev vs production) |
| crates/domain/src/settings.rs | semantic_search_enabled setting (schema v11) |
| apps/desktop/src-react/hooks/useModelDownload.ts | Progress toast and cancellation UI |
| apps/desktop/src-react/components/settings/SemanticSearchToggle.tsx | Settings toggle component |
| tools/dev/download-embedding-model.sh | Dev/CI pre-download script |
