
Import

Status: Accepted
Reference epics: INK-826, INK-830
ADRs: ADR-017, ADR-018

How external markdown files are analyzed, converted, and imported into an Inklings workspace. Every page created by an import run is written through a WorldWrite with origin: Imported and lifecycle: Draft. The import adapter is one caller of the submit boundary; it does not bypass it.


Imports are writes. The same rule that governs an editor save and an agent tool call governs an import run: the domain only accepts a modification that is expressed as a WorldWrite. See ADR-017.

The import adapter contributes two fixed defaults that distinguish import writes from other callers:

| Field | Value on import | Why |
|---|---|---|
| origin | Imported | The content came from outside the workspace. The fact is permanent and does not shift as the content is later edited. See provenance §origin. |
| lifecycle | Draft | The user chose to import, not to endorse. Canonicalization is a separate, explicit act. See provenance §lifecycle and ADR-018. |

The Draft default is uniform across source types. Source “authoritativeness” (an Obsidian vault the user has been tending for years, an AnyType export from another tool) does not alter the default. The user’s import click is permission to ingest, not to canonicalize.

A user who wants to accept their import wholesale uses the bulk-canonicalize affordance described below. That is an explicit, separate act performed after the import completes.


The write-path itself — what the submit boundary does with a WorldWrite, what side effects fire, and the per-row transaction shape — is covered in data-flow/write-path. This page covers the import-specific shape: how the adapter produces each WorldWrite, what lifecycle it carries, and the conflict and bulk-canonicalize behaviors layered on top.


Code: crates/application/src/import/analyze.rs, crates/infrastructure/import/src/import/import_repository.rs

The analysis phase is a read-only scan and produces no writes. The user selects a source folder; the frontend calls analyze_import. AnalyzeImportUseCase requires the ImportExecute capability, validates the path is a directory, then delegates to MarkdownImporter::analyze.

The importer checks markers in this order:

  1. .obsidian/ directory → Obsidian Vault (activates wiki-link conversion, loads Obsidian config)
  2. objects/ + types/ directories → AnyType Export (scans only objects/, converts markdown links, extracts AnyType metadata)
  3. Fallback → Markdown Folder
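
The marker order above can be sketched as a small pure function. This is an illustrative sketch only: `SourceKind` and `detect_source_kind` are hypothetical names, and the function takes a list of top-level directory names rather than touching the file system the way the real importer does.

```rust
/// Source kinds named on this page. The type name is hypothetical.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SourceKind {
    ObsidianVault,
    AnyTypeExport,
    MarkdownFolder,
}

/// Classifies a source folder from the names of its top-level directories.
/// Hypothetical helper: the real importer inspects the folder on disk.
pub fn detect_source_kind(top_level_dirs: &[&str]) -> SourceKind {
    let has = |name: &str| top_level_dirs.iter().any(|d| *d == name);
    if has(".obsidian") {
        // Checked first, so an Obsidian vault wins even if objects/ and types/ exist.
        SourceKind::ObsidianVault
    } else if has("objects") && has("types") {
        SourceKind::AnyTypeExport
    } else {
        SourceKind::MarkdownFolder
    }
}
```

Because the checks run in order, a folder that contains both `.obsidian/` and the AnyType directories is treated as an Obsidian vault.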

Obsidian is the more direct import target; AnyType is supported for users migrating off another tool.

| Item | Treatment |
|---|---|
| `.git`, `.obsidian`, `node_modules`, `.vscode`, `.idea`, `__pycache__` | Skipped entirely |
| Hidden directories (starting with `.`) | Skipped |
| Symlinks | Skipped (security: prevents path escape) |
| Non-`.md` files | Counted but not imported |
| Obsidian user ignore filters | Applied in addition to built-in rules |
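
A minimal sketch of the built-in scan rules, assuming a per-entry classifier. `Treatment` and `classify_entry` are hypothetical names, the order in which the checks run is an assumption, and Obsidian user ignore filters are left out.

```rust
/// What the scan does with one directory entry. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
pub enum Treatment {
    SkipBuiltIn,
    SkipHidden,
    SkipSymlink,
    CountOnly,
    Import,
    Descend,
}

/// Directory names the page lists as always skipped.
const BUILT_IN_IGNORED: [&str; 6] = [
    ".git", ".obsidian", "node_modules", ".vscode", ".idea", "__pycache__",
];

/// Classifies one entry by name and kind. Check order is an assumption.
pub fn classify_entry(name: &str, is_dir: bool, is_symlink: bool) -> Treatment {
    if is_symlink {
        // Symlinks are never followed, which prevents path escape.
        return Treatment::SkipSymlink;
    }
    if is_dir {
        if BUILT_IN_IGNORED.iter().any(|d| *d == name) {
            return Treatment::SkipBuiltIn;
        }
        if name.starts_with('.') {
            return Treatment::SkipHidden;
        }
        return Treatment::Descend;
    }
    if name.ends_with(".md") {
        Treatment::Import
    } else {
        // Non-markdown files are counted but not imported.
        Treatment::CountOnly
    }
}
```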

The scan builds a FolderPreview tree showing the top-level folder structure, and detects potential ImportIssue types: invalid frontmatter, paths exceeding max depth, duplicate slugs, and suspicious path components.

Each skipped folder is recorded as an ExcludedFolder with an ExclusionReason (BuiltInIgnored, HiddenDirectory, ObsidianUserFilter, AttachmentFolder, AnyTypeMetadataDir, or AnyTypeTemplateDir). These are returned in source-specific metadata (ObsidianMetadata.excluded_folders or AnyTypeMetadata.excluded_folders) so the import preview dialog can show users exactly which folders were excluded and why, grouped by reason in a collapsible section.

For AnyType exports, the importer scans only the objects/ directory for content files. Metadata directories (types/, relations/, relationsOptions/, schemas/, filesObjects/) and templates/ are excluded. AnyTypeMetadata is populated with: suggested workspace name (from folder name), a type summary (object type names and counts), relation count, and excluded folder details. Attachment files are discovered from the files/ directory.


Stage 2 — Execution (Per-File WorldWrites)


Code: crates/application/src/import/execute.rs, crates/infrastructure/import/src/import/import_repository.rs

ExecuteImportUseCase::execute_with_progress delegates to MarkdownImporter::execute_with_progress. For each discovered .md file the importer performs the per-file conversion below, then hands the converted content to the WorldWrite adapter. The adapter — not the importer — is what the submit boundary sees.

The file is read from disk. YAML frontmatter (delimited by ---) is extracted and parsed. Recognized frontmatter keys (e.g., tags, type) are mapped to Inklings metadata.
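
The extraction step can be illustrated with a simplified splitter. This sketch assumes `\n` line endings and a closing `---` on its own line; `split_frontmatter` is a hypothetical name, and the YAML parsing and key mapping that follow are not shown.

```rust
/// Splits a markdown file into (frontmatter, body).
/// Returns `None` for the frontmatter when no complete `---` block opens the file.
pub fn split_frontmatter(raw: &str) -> (Option<&str>, &str) {
    // Frontmatter must open on the very first line.
    let rest = match raw.strip_prefix("---\n") {
        Some(rest) => rest,
        None => return (None, raw),
    };
    // An empty block closes immediately.
    if let Some(body) = rest.strip_prefix("---\n") {
        return (Some(""), body);
    }
    // Otherwise look for the closing delimiter on its own line.
    match rest.find("\n---\n") {
        // 5 is the byte length of "\n---\n".
        Some(idx) => (Some(&rest[..idx]), &rest[idx + 5..]),
        None => (None, raw),
    }
}
```

An unterminated block is treated as body text here; how the real importer handles that case is not stated on this page.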

Step 2 — Link conversion (Obsidian and AnyType sources)

Link syntax differs between source types. The importer converts all link forms to Inklings’s [[Display|slug]] format:

Obsidian sources:

| Input (Obsidian) | Output (Inklings) | Notes |
|---|---|---|
| `[[Target Page]]` | `[[Target Page\|target-page]]` | Display = title, slug = slugified title |
| `[[Target Page\|Display Text]]` | `[[Display Text\|target-page]]` | Alias becomes display; slug from target |
| `[[Target Page#Heading]]` | `[[Target Page\|target-page#heading]]` | Heading fragment preserved |
| `[[Folder/Target Page]]` | `[[Target Page\|target-page]]` | Folder prefix stripped |
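
The four Obsidian rows above can be reproduced with plain string handling. This is a sketch, not the importer's code: `slugify` and `convert_wiki_link` are hypothetical names, and the slug rule (lowercase ASCII, runs of other characters collapsed to one hyphen) is an assumption chosen to match the table.

```rust
/// Lowercases and hyphenates a title. Hypothetical slug rule for illustration.
fn slugify(title: &str) -> String {
    let mut slug = String::new();
    for ch in title.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Converts the text inside one Obsidian `[[...]]` link to Inklings form.
pub fn convert_wiki_link(inner: &str) -> String {
    // Split off an optional alias: `Target|Display Text`.
    let (target, alias) = match inner.split_once('|') {
        Some((t, a)) => (t, Some(a)),
        None => (inner, None),
    };
    // Split off an optional heading fragment: `Target#Heading`.
    let (page, heading) = match target.split_once('#') {
        Some((p, h)) => (p, Some(h)),
        None => (target, None),
    };
    // Strip any folder prefix: `Folder/Target Page` becomes `Target Page`.
    let title = page.rsplit('/').next().unwrap_or(page).trim();
    let display = alias.unwrap_or(title).trim();
    let mut slug = slugify(title);
    if let Some(h) = heading {
        slug.push('#');
        slug.push_str(&slugify(h));
    }
    format!("[[{}|{}]]", display, slug)
}
```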

AnyType sources:

| Input (AnyType) | Output (Inklings) | Notes |
|---|---|---|
| `[Kael Stormblade](Kael_Stormblade.md)` | `[[Kael Stormblade\|kael-stormblade]]` | Markdown link to wiki-link |
| `![Map](files/realm_map.png)` | `![Map](files/realm_map.png)` | Image refs preserved as-is |
| `[Google](https://example.com)` | `[Google](https://example.com)` | External URLs preserved unchanged |
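
A sketch of the AnyType rules in the table above, assuming the link has already been parsed into its parts. `convert_markdown_link` and `slugify` are hypothetical names; deriving the slug from the target file name (rather than from the link text) and detecting external URLs by `://` are both assumptions.

```rust
/// Lowercases and hyphenates a name. Hypothetical slug rule for illustration.
fn slugify(text: &str) -> String {
    let mut slug = String::new();
    for ch in text.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Converts one parsed markdown link. Only internal `.md` links become wiki-links.
pub fn convert_markdown_link(is_image: bool, text: &str, href: &str) -> String {
    let is_external = href.contains("://");
    if is_image || is_external || !href.ends_with(".md") {
        // Images and external URLs pass through unchanged.
        let bang = if is_image { "!" } else { "" };
        return format!("{}[{}]({})", bang, text, href);
    }
    let stem = href.strip_suffix(".md").unwrap_or(href);
    let file_name = stem.rsplit('/').next().unwrap_or(stem);
    format!("[[{}|{}]]", text, slugify(file_name))
}
```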

AnyType frontmatter object relations (values ending in .md) are also converted to wiki-links and appended as a “Relations” section in the page body.

Plain markdown sources (non-Obsidian, non-AnyType) have their links passed through unchanged.

The page slug is derived from the file’s relative path within the source directory. Path separators become hyphens; non-alphanumeric characters are stripped or replaced. Intra-run duplicate slugs use a numeric suffix (-2, -3, etc.). Collisions with existing canonical content are governed by the conflict strategy below.
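
As an illustration of path-based slugs with intra-run suffixes, the sketch below assumes a simple normalization (lowercase ASCII, everything else collapsed to a hyphen). `slug_from_path` and `unique_slug` are hypothetical names, and the exact stripping rules of the real importer may differ.

```rust
use std::collections::HashSet;

/// Derives a slug from a path relative to the source directory.
/// Path separators and other non-alphanumeric characters become hyphens.
pub fn slug_from_path(relative_path: &str) -> String {
    let stem = relative_path.strip_suffix(".md").unwrap_or(relative_path);
    let mut slug = String::new();
    for ch in stem.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Returns `base`, or `base-2`, `base-3`, ... if earlier files in the
/// same run already claimed the slug. Records the result in `used`.
pub fn unique_slug(base: &str, used: &mut HashSet<String>) -> String {
    let mut candidate = base.to_string();
    let mut n = 2;
    while used.contains(&candidate) {
        candidate = format!("{}-{}", base, n);
        n += 1;
    }
    used.insert(candidate.clone());
    candidate
}
```

The `used` set covers only slugs produced within the current run; collisions with pages already in the workspace go through the conflict strategy instead.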

For each file the adapter constructs a WorldWrite with the fields fixed by the import origin:

| Field | Value | Source |
|---|---|---|
| origin | Imported | Fixed by adapter; cannot be overridden by the caller. |
| lifecycle | Draft | Fixed by adapter; cannot be overridden by the caller. Elevating to Canonical happens after the run via bulk-canonicalize. |
| origin_source_id | Import run id + source file path | Ties every Draft back to the run that produced it. Used for undo, bulk-canonicalize, and re-import reconciliation. |
| content | Converted markdown + extracted metadata | Output of Steps 1–3. |
| parent_slug | Resolved from relative file path | Preserves folder structure as page hierarchy. |
| derivation_sources | None | An import is not a derivation from existing workspace content. Derivation only applies to agent or editor writes that cite workspace sources. See derivation-links. |
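
The field table above might translate into a constructor along these lines. All type and function names here are hypothetical stand-ins for the real domain types, the `run_id:path` encoding of origin_source_id is an assumption, and content is reduced to a plain string.

```rust
/// Only the origin this page discusses; other origins are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Origin {
    Imported,
}

/// Lifecycle states mentioned on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Draft,
    Canonical,
    Retired,
}

/// Simplified stand-in for the domain's WorldWrite.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorldWrite {
    pub origin: Origin,
    pub lifecycle: Lifecycle,
    pub origin_source_id: String,
    pub content: String,
    pub parent_slug: Option<String>,
    pub derivation_sources: Vec<String>,
}

/// Builds the write for one imported file. The caller supplies content and
/// hierarchy but has no parameter through which to change origin or lifecycle.
pub fn build_import_write(
    run_id: &str,
    relative_path: &str,
    content: String,
    parent_slug: Option<String>,
) -> WorldWrite {
    WorldWrite {
        origin: Origin::Imported,
        lifecycle: Lifecycle::Draft,
        // Assumed encoding: run id and source path joined by a colon.
        origin_source_id: format!("{}:{}", run_id, relative_path),
        content,
        parent_slug,
        // An import cites no workspace sources.
        derivation_sources: Vec::new(),
    }
}
```

Keeping origin and lifecycle out of the function signature is one way to make the "cannot be overridden by the caller" rule hold by construction.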

The adapter then hands the WorldWrite to the submit boundary. The boundary validates, applies, and fires side effects as described in data-flow/write-path §stage 4. The importer does not call CreatePageUseCase directly; that path is behind the boundary.

progress_cb(ImportProgress { phase: Executing, completed, total }) is called after each file. The frontend displays this in real time via the import dialog. The progress stream also reports the running count of deviation records produced by Overwrite-mode collisions, so the user sees conflicts accumulate rather than encountering them only at the end.
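
A minimal sketch of the per-file callback shape, assuming the callback is any `FnMut`. The struct fields mirror the call shown above; `Phase`, `run_with_progress`, and the closure signature are hypothetical, and the running deviation count is omitted because its field name is not given here.

```rust
/// Only the phase named on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Executing,
}

/// Progress snapshot passed to the callback after each file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ImportProgress {
    pub phase: Phase,
    pub completed: usize,
    pub total: usize,
}

/// Walks the file list and reports progress once per file.
pub fn run_with_progress<F: FnMut(ImportProgress)>(files: &[&str], mut progress_cb: F) {
    let total = files.len();
    for (index, _file) in files.iter().enumerate() {
        // Per-file conversion and submission would happen here.
        progress_cb(ImportProgress {
            phase: Phase::Executing,
            completed: index + 1,
            total,
        });
    }
}
```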


Slug collisions between an import row and existing canonical content are resolved by the import-time conflict mode chosen in the dialog. All three modes produce well-formed WorldWrites; they differ in what the write does when a canonical page already occupies the target slug.

| Strategy | Behavior | Submit-boundary effect |
|---|---|---|
| Skip (default) | Existing page is left unchanged; the import row is counted in skipped_count. | No WorldWrite is submitted for the skipped row. |
| Rename | Imported page gets a numeric suffix on the slug (-2, -3, …). | A single WorldWrite is submitted at the new slug; no conflict at the boundary. |
| Overwrite | Existing page is overwritten with the imported content. | A WorldWrite is submitted at the conflicting slug. Because the existing content is Canonical and the incoming content differs, the boundary produces a DeviationRecord per domain rule 5. The overwrite is not silent. |

The Overwrite behavior is the important case. In the pre-world-model importer, Overwrite mutated canonical content with no epistemic trace. After the submit-boundary rewrite, every Overwrite-mode collision produces a deviation record with type: agent-answer-vs-canonical (importer is an external caller with no agent identity, but the conflict type is the same: a non-authored write displacing canonical content). The deviation is visible in the author’s triage inbox and can be resolved, dismissed, or rolled back there. See deviation-records §always-produce: every conflict produces a record. No judgment in the pipeline.

Rename is the recommended choice when the user wants the import to be non-destructive while still bringing in every row. Skip, the dialog's default, is the safe choice when re-running an import against a workspace that has been partially imported before.
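
The three modes can be summarized as a decision function over the set of slugs already occupied by canonical pages. This is a sketch under assumptions: `ConflictMode`, `Resolution`, and `resolve_collision` are hypothetical names, and the deviation record itself is produced by the submit boundary, not by code like this.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConflictMode {
    Skip,
    Rename,
    Overwrite,
}

/// What the adapter does with one row. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
pub enum Resolution {
    /// No WorldWrite is submitted; the row counts toward skipped_count.
    NotSubmitted,
    /// A WorldWrite is submitted at `slug`. `may_deviate` marks writes that
    /// land on an occupied slug, where the boundary may record a deviation.
    SubmitAt { slug: String, may_deviate: bool },
}

pub fn resolve_collision(mode: ConflictMode, slug: &str, existing: &HashSet<String>) -> Resolution {
    if !existing.contains(slug) {
        return Resolution::SubmitAt { slug: slug.to_string(), may_deviate: false };
    }
    match mode {
        ConflictMode::Skip => Resolution::NotSubmitted,
        ConflictMode::Rename => {
            // Try -2, -3, ... until a free slug is found.
            let mut n = 2;
            let mut candidate = format!("{}-{}", slug, n);
            while existing.contains(&candidate) {
                n += 1;
                candidate = format!("{}-{}", slug, n);
            }
            Resolution::SubmitAt { slug: candidate, may_deviate: false }
        }
        ConflictMode::Overwrite => {
            Resolution::SubmitAt { slug: slug.to_string(), may_deviate: true }
        }
    }
}
```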


Because every imported page lands as lifecycle: Draft, an import does not change what is canonical in the workspace until the author acts. The import dialog’s post-run summary offers a bulk-canonicalize affordance so a user who has just imported an authoritative vault can elevate the whole run (or a subset) in one step.

Scope. Bulk-canonicalize operates on pages written by a specific import run, keyed by origin_source_id. A user can:

  • Canonicalize the entire run.
  • Canonicalize a folder subtree.
  • Canonicalize a selection made from the post-run summary list.

Mechanism. Each elevation is itself a WorldWrite — a lifecycle transition submitted by the author, moving the target page to lifecycle: Canonical. The page’s stored origin remains Imported; origin does not shift when lifecycle does. The elevation is attributable to the author via the event log (the WorldWrite carries author identity), but the fact that the content entered the workspace as an import is permanent. See provenance §invariants: elevating lifecycle does not rewrite origin.
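
The invariant in the paragraph above (lifecycle moves, origin stays) can be shown on an in-memory model. This sketch mutates pages directly for brevity; in the system described here each elevation is submitted as its own WorldWrite. `Page`, `canonicalize_run`, and the `run_id:` prefix convention for origin_source_id are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Origin {
    Imported,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Draft,
    Canonical,
}

/// Minimal page record for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Page {
    pub origin: Origin,
    pub lifecycle: Lifecycle,
    pub origin_source_id: String,
}

/// Elevates every Draft page from one run and returns how many changed.
/// Origin is never touched.
pub fn canonicalize_run(pages: &mut [Page], run_id: &str) -> usize {
    let prefix = format!("{}:", run_id);
    let mut elevated = 0;
    for page in pages.iter_mut() {
        let from_run = page.origin_source_id.starts_with(prefix.as_str());
        if from_run && page.lifecycle == Lifecycle::Draft {
            page.lifecycle = Lifecycle::Canonical;
            elevated += 1;
        }
    }
    elevated
}
```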

Default. No bulk action runs by default. The dialog presents the affordance; the user chooses. This preserves the D19 stance: the user chose to import, not to endorse. Canonicalization is a separate act.

Undo. Bulk-canonicalize is undoable for the duration of the run’s post-import session. After that, individual pages can be returned to Draft via the lifecycle menu on each page, which is also a WorldWrite.


Files are always copied from the source folder to a new workspace at the specified target path. The source folder remains unchanged. The importer does not edit the source; it is a one-way ingest.


Each WorldWrite produced by the import run fires the same WriteEffectCoordinator side effects as any other write — see data-flow/write-path §stage 4 for the full pipeline:

  • Event log: EntityType::Page / EventType::Created recorded, with origin: Imported carried on the event row.
  • Embedding pipeline: Each Draft page is queued for vectorization (deferred, debounced). Draft pages are embedded on the same schedule as Canonical pages; lifecycle does not gate embedding.
  • Sync queue: Block update enqueued if sync is enabled.
  • FTS5: Automatically updated via SQLite trigger on raw_markdown.
  • Re-validation flags: Imports produce no derivation links (a fresh import is not derived from workspace content), so no re-validation flags are raised on other pages. A later bulk-canonicalize may produce re-validation flags on pages that had been derived from the Draft while it was still Draft — see retroactive-revision.

The importer itself fires none of these. It constructs the WorldWrite and submits it; the coordinator does the rest.


If the import is cancelled mid-execution, MarkdownImporter::rollback_import deletes only files confirmed to be within the workspace directory (canonicalized path check prevents escape). Pages already created through the submit boundary are not rolled back automatically — the Tauri command layer issues compensating WorldWrite operations (lifecycle transitions to Retired, not hard deletes) against the pages keyed by the run’s origin_source_id. This keeps rollback uniform with every other retire path in the system: a retired page still carries its origin and its history; it is not scrubbed from the event log.
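
The canonicalized path check mentioned above could look roughly like this, using only the standard library. `is_within_workspace` is a hypothetical helper, not the body of `rollback_import`; it assumes both paths exist, since `std::fs::canonicalize` fails on missing paths.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true only if `candidate` resolves to a location inside `workspace`
/// after symlinks and `..` components are resolved.
pub fn is_within_workspace(workspace: &Path, candidate: &Path) -> io::Result<bool> {
    let workspace = fs::canonicalize(workspace)?;
    let candidate = fs::canonicalize(candidate)?;
    // Path::starts_with compares whole components, so `/ws-other` does not
    // count as being inside `/ws`.
    Ok(candidate.starts_with(&workspace))
}
```

A rollback routine would call a check like this before each delete and leave anything that fails it untouched.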

Fully discarding a cancelled or unwanted import run — hard-deleting rather than retiring — is a separate administrative action, performed against the same origin_source_id, and is out of scope for the import flow itself.


| Failure | Behavior |
|---|---|
| Source path does not exist | SourceNotFound error before any I/O |
| Source is a file, not a directory | InvalidSource error |
| Missing ImportExecute capability | ImportFailed error; no deviation record per domain rule 7 |
| Invalid frontmatter | ImportIssue::InvalidFrontmatter recorded in analysis; file imported with frontmatter stripped |
| Unresolvable wiki-link target | Link converted with best-effort slug; marked as ghost link at render time |
| Intra-run duplicate slug | Suffix appended (-2, -3, …); not an error |
| Canonical-slug collision (Overwrite mode) | WorldWrite applied + DeviationRecord produced; row counted in deviations |
| Canonical-slug collision (Skip mode) | No WorldWrite submitted; row counted in skipped_count |
| WorldWrite construction fails (malformed content) | DomainError; file counted in skipped_count; error added to ImportResult.errors |
| Submit-boundary validation rejects the write | DomainError; file counted in skipped_count; error added to ImportResult.errors |
| Symlink in source | Skipped silently during scan |
| Path traversal attempt | Skipped with warn!; path marked as suspicious in analysis issues |

Errors during an import run do not abort the run. Each row is independent; failures accumulate in ImportResult.errors and are surfaced in the post-run summary.
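
The accumulate-and-continue behavior can be sketched as a loop that never returns early. `run_rows`, the `imported_count` field, and the string error type are hypothetical; only `skipped_count` and `errors` are named on this page.

```rust
/// Simplified stand-in for ImportResult.
#[derive(Debug, Default)]
pub struct ImportResult {
    pub imported_count: usize, // hypothetical field
    pub skipped_count: usize,
    pub errors: Vec<String>,
}

/// Submits every row independently. A failing row is recorded and the
/// loop moves on to the next one.
pub fn run_rows<F>(rows: &[&str], mut submit: F) -> ImportResult
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut result = ImportResult::default();
    for row in rows.iter().copied() {
        match submit(row) {
            Ok(()) => result.imported_count += 1,
            Err(e) => {
                result.skipped_count += 1;
                result.errors.push(format!("{}: {}", row, e));
            }
        }
    }
    result
}
```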

