Ingest: Structure-Guided Document Chunking
Turns external documents (PDF, HTML, DOCX, Markdown) into indexed, searchable notes — preserving structural boundaries instead of splitting arbitrarily.
A parser builds a skeleton tree of the document hierarchy; each logical section becomes one atomic note, so a clause is never split across two chunks.
Table extraction converts tabular rows into indexable sentences (RowToSentence) so column relationships survive retrieval.
A noise filter discards non-informative nodes (table of contents, executive summaries, glossaries) that degrade retrieval quality.
Ingestion runs as a background Job::Ingest with progress tracking and automatic retry via the job queue.
For: Teams that need to import technical documentation, research papers, or meeting transcripts into the vault without losing the structural context that makes answers accurate.
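As an illustration of the RowToSentence idea above, here is a minimal sketch. The function name `row_to_sentence`, the "header is value" phrasing, and the `; ` separator are assumptions made for this example; the source does not specify the sentence template.

```rust
/// Hypothetical sketch: flatten one table row into a sentence that keeps
/// each value next to its column header. The template is an assumption.
fn row_to_sentence(headers: &[&str], row: &[&str]) -> String {
    let parts: Vec<String> = headers
        .iter()
        .zip(row.iter())
        // Skip empty cells so the sentence carries no dangling headers.
        .filter(|(_, value)| !value.trim().is_empty())
        .map(|(header, value)| format!("{} is {}", header.trim(), value.trim()))
        .collect();
    if parts.is_empty() {
        String::new()
    } else {
        format!("{}.", parts.join("; "))
    }
}
```

Keeping the header in every clause means a retrieved sentence still says which column a value came from, even when the rest of the table is not in the result set.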
Stable Wikilinks: Persistent Cross-Note Graph
Makes wikilink references between notes durable — links survive renames and deletions by anchoring to a stable ULID identifier rather than a file path.
A redirect_table maps every past title or path to the canonical ULID anchor, so [[old-title]] resolves correctly after the note is renamed.
Backlinks are indexed at write time, exposing links_in/links_out on every search result for graph-neighborhood traversal.
vault_search_multi merges results from multiple queries using Reciprocal Rank Fusion (RRF), so wikilink clusters surface naturally alongside keyword matches.
For: Knowledge engineers and developers who build interconnected notes and need cross-references to remain valid as the vault evolves over time.
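Reciprocal Rank Fusion, mentioned above for vault_search_multi, scores each result as the sum of 1 / (k + rank) over every list it appears in. The sketch below shows the standard formula only. The function name `rrf_merge`, the use of plain string ids, and the alphabetical tie-break are assumptions, and k = 60 is a commonly used default rather than a documented gradatum setting.

```rust
use std::collections::HashMap;

/// Merge ranked result lists with Reciprocal Rank Fusion.
/// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
fn rrf_merge(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (index, id) in list.iter().enumerate() {
            let rank = (index + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut merged: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the order is deterministic.
    merged.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    merged
}
```

A note that ranks moderately well in several queries can outrank a note that ranks first in only one, which is why linked clusters tend to surface together.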
Note History: Copy-on-Write Version Trail
Records a complete version history for every note write, so any previous state of a note can be retrieved or diffed without external tooling.
On each vault_write, the previous version is stored in a .history/<ulid>/ directory before the note is updated (Copy-on-Write semantics).
Dedicated history/* endpoints let callers inspect or restore any prior version by timestamp or sequence number.
Lifecycle configuration caps storage at max_versions=50 per note, pruning the oldest revisions automatically.
For: Developers and teams who need an audit trail of how knowledge evolved — including distillation drift detection and rollback of accidental overwrites.
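The copy-on-write and pruning behavior described above can be modeled in memory as follows. This is a sketch under stated assumptions: `VersionedNote` and its fields are hypothetical names, and a `VecDeque` stands in for the `.history/<ulid>/` directory.

```rust
use std::collections::VecDeque;

/// In-memory stand-in for a note plus its history directory (hypothetical type).
struct VersionedNote {
    current: String,
    history: VecDeque<String>, // oldest revision at the front
    max_versions: usize,
}

impl VersionedNote {
    fn new(initial: &str, max_versions: usize) -> Self {
        VersionedNote {
            current: initial.to_string(),
            history: VecDeque::new(),
            max_versions,
        }
    }

    /// Copy-on-write: archive the previous content first, then update the note.
    fn write(&mut self, new_content: &str) {
        let previous = std::mem::replace(&mut self.current, new_content.to_string());
        self.history.push_back(previous);
        // Prune the oldest revisions once the cap is exceeded.
        while self.history.len() > self.max_versions {
            self.history.pop_front();
        }
    }
}
```

With `max_versions` set to 50, the 51st archived revision pushes out the oldest one, so storage per note stays bounded.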
Optimistic Locking: Safe Concurrent Writes
Prevents silent data loss when two processes write the same note concurrently, using a content-hash check instead of pessimistic locks.
vault_write accepts an optional write_if_match parameter containing the SHA-256 hash of the note version the caller last read.
If the stored note has changed since that read, the server returns 409 Conflict with a WriteConflict descriptor — the caller decides how to merge.
Writes without a hash succeed unconditionally, preserving backward compatibility for append-only workflows.
For: Developers building multi-agent pipelines or concurrent writer workflows where two agents may update the same note within the same time window.
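The write_if_match check can be sketched as below. Two assumptions are worth stating clearly: the shape of `WriteConflict` is invented for illustration, and `content_hash` uses the standard library's `DefaultHasher` only as a stand-in, because SHA-256 is not part of the Rust standard library. A real implementation would use a SHA-256 digest as described above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical conflict descriptor returned alongside a 409 Conflict.
#[derive(Debug, PartialEq)]
struct WriteConflict {
    expected: String,
    actual: String,
}

struct Note {
    content: String,
}

/// Stand-in hash (NOT SHA-256); swap in a real SHA-256 digest in practice.
fn content_hash(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Write only if the stored note still matches the hash the caller last read.
/// Passing None skips the check, so the write succeeds unconditionally.
fn vault_write(
    note: &mut Note,
    new_content: &str,
    write_if_match: Option<&str>,
) -> Result<(), WriteConflict> {
    if let Some(expected) = write_if_match {
        let actual = content_hash(&note.content);
        if expected != actual {
            return Err(WriteConflict { expected: expected.to_string(), actual });
        }
    }
    note.content = new_content.to_string();
    Ok(())
}
```

The second of two writers holding the same stale hash gets the conflict and can re-read, merge, and retry, while the stored note is left untouched.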
Provenance Trust Score: Verifiable Note Lineage
Attaches a computed trust score to every note, derived from its origin and distillation history, so retrieved content carries verifiable lineage.
Trust is calculated from four sources: the writing agent, the distillation chain, the number of corroborating notes, and the confidence score at ingestion.
Distilled notes inherit a weighted mean of their source trust scores multiplied by the distillation confidence.
The trust field integrates with temporal decay (F-17): notes from less-trusted provenance decay faster in search ranking.
For: Teams running multi-agent pipelines who need to distinguish high-confidence knowledge from speculative or low-provenance notes before acting on retrieved content.
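The inheritance rule for distilled notes (weighted mean of source trust, multiplied by distillation confidence) can be written out as a small function. The source does not say how the weights are chosen, so the per-source weight, the function name `distilled_trust`, and the clamp to the range 0 to 1 are assumptions.

```rust
/// Hypothetical sketch: trust of a distilled note.
/// Each source is a (trust, weight) pair; returns None if total weight is not positive.
fn distilled_trust(sources: &[(f64, f64)], confidence: f64) -> Option<f64> {
    let total_weight: f64 = sources.iter().map(|(_, weight)| weight).sum();
    if total_weight <= 0.0 {
        return None;
    }
    let weighted_sum: f64 = sources.iter().map(|(trust, weight)| trust * weight).sum();
    // Weighted mean scaled by distillation confidence, kept within [0, 1].
    Some((weighted_sum / total_weight * confidence).clamp(0.0, 1.0))
}
```

For example, sources with trust 0.8 (weight 1) and 0.4 (weight 3) have a weighted mean of 0.5; at a distillation confidence of 0.5 the distilled note gets a trust of 0.25.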
Temporal Decay: Recency-Weighted Search Ranking
Makes search results reflect how fresh and still-valid each note is — older or expired content scores lower without being deleted.
Each note carries a validity state (valid, temporal, or expired) and a document kind (static, versioned, or event), which together determine its decay profile.
A recency score computed relative to the current date is blended with the semantic score using a configurable temporal weight (default 0.40).
Event notes use a raw cosine relevance gate before decay is applied, preventing stale event records from surfacing on unrelated queries.
For: Agents and search clients that need results biased toward current knowledge — particularly useful for decision logs, meeting notes, and time-sensitive technical documentation.
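To make the blending step concrete, here is one way the scores could be combined. Both formulas are assumptions: the source states only that a recency score is blended with the semantic score using a temporal weight (default 0.40), not the shape of the decay curve or the blend. The sketch uses exponential half-life decay and linear interpolation.

```rust
/// Assumed recency curve: exponential decay with a half-life expressed in days.
fn recency_score(age_days: f64, half_life_days: f64) -> f64 {
    0.5_f64.powf(age_days.max(0.0) / half_life_days)
}

/// Assumed blend: linear interpolation between semantic and recency scores.
fn blended_score(semantic: f64, recency: f64, temporal_weight: f64) -> f64 {
    (1.0 - temporal_weight) * semantic + temporal_weight * recency
}
```

Under these assumptions, a note with semantic score 0.8 and recency 0.5 scores 0.6 × 0.8 + 0.4 × 0.5 = 0.68 at the default weight, so a stale note needs a clearly better semantic match to outrank a fresh one.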
Event-Log Vault: LLM Cost Attribution
Records every LLM call made by the vault with model, token count, estimated cost, latency, and the feature that triggered it — giving full budget visibility per feature.
A QaEvent struct is captured by the gateway intercept layer at each LLM completion: model identifier, prompt/completion token counts, cost estimate, latency, and a feature_id tag.
Events are stored in an append-only event_log table, queryable via the jobs introspection API for per-feature cost breakdowns.
The event log feeds the distillation learn job (F-22), which uses token patterns to identify cost-optimization candidates over time.
For: Operators who need to understand which vault features drive LLM spend, and developers building cost-attribution dashboards or budget-alert workflows on top of the vault.
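A per-feature cost breakdown over the captured events might look like the sketch below. The field list follows the QaEvent description above, but the field names, their types, and the `cost_by_feature` helper are assumptions, and an in-memory slice stands in for the event_log table.

```rust
use std::collections::BTreeMap;

/// Fields follow the description above; names and types are assumptions.
#[allow(dead_code)]
struct QaEvent {
    model: String,
    prompt_tokens: u32,
    completion_tokens: u32,
    cost_estimate: f64,
    latency_ms: u64,
    feature_id: String,
}

/// Sum estimated cost and total tokens per feature_id.
fn cost_by_feature(events: &[QaEvent]) -> BTreeMap<String, (f64, u64)> {
    let mut totals: BTreeMap<String, (f64, u64)> = BTreeMap::new();
    for event in events {
        let entry = totals.entry(event.feature_id.clone()).or_insert((0.0, 0));
        entry.0 += event.cost_estimate;
        entry.1 += u64::from(event.prompt_tokens) + u64::from(event.completion_tokens);
    }
    totals
}
```

Because the log is append-only, the same aggregation can be re-run over any time slice without worrying about rows changing underneath it.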
Privacy Filter: Automatic PII Redaction on Write
Intercepts every note before it is stored and redacts personally identifiable information (PII) — without an external API call or network dependency.
The PrivacyFilter is registered as a WriteHook on DocumentStore; it runs synchronously on every vault_write before the note reaches the index.
The initial textual pass uses heuristic pattern matching. A later ONNX-based Named Entity Recognition model (ONNX is a portable neural network model format) runs fully on-device, with no data leaving the host.
Filtered fields are marked in the note frontmatter so downstream processes know redaction has occurred.
For: Developers ingesting documents that may contain personal data — emails, transcripts, HR notes — and operators who need a compliance-friendly vault with no third-party data processing.
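The heuristic pass could be as simple as the following sketch, which covers only email-like tokens. It is illustrative, not the PrivacyFilter implementation: the function names, the `[REDACTED]` marker, and the matching rule are all assumptions, and a real filter would cover more PII categories.

```rust
/// Heuristic check: does this space-delimited token look like an email address?
fn looks_like_email(token: &str) -> bool {
    match token.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && domain.contains('.') && !domain.starts_with('.')
        }
        None => false,
    }
}

/// Replace email-like tokens with a marker; returns the redacted text and
/// the number of redactions, which a caller could record in the frontmatter.
fn redact_emails(text: &str) -> (String, usize) {
    let mut count = 0;
    let tokens: Vec<&str> = text
        .split(' ')
        .map(|token| {
            if looks_like_email(token) {
                count += 1;
                "[REDACTED]"
            } else {
                token
            }
        })
        .collect();
    (tokens.join(" "), count)
}
```

Everything here is string processing in-process, which is what allows the hook to run synchronously on each write with no network dependency.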
Drift Detection: Identity Write Hook
Detects unauthorized or unexpected changes to an agent's identity notes and flags them before they silently alter agent behavior.
A DriftDetector is registered as a WriteHook on DocumentStore at startup; it monitors writes to the identity/ locus without a direct vault dependency.
On each write, the detector computes a semantic distance between the new content and the last validated version; divergence above threshold triggers a drift_detected event.
The drift event is surfaced via the jobs SSE (Server-Sent Events) stream, allowing operators or higher-level workflows to review the change before it takes effect.
For: Operators running persistent agents whose identity notes must not change without explicit authorization — detecting accidental overwrites and adversarial prompt-injection attempts.
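The threshold check can be sketched with cosine distance over embedding vectors. The source says only "semantic distance", so cosine distance is an assumption here, as are the function names and the choice to treat incomparable vectors as drift.

```rust
/// Cosine distance (1 - cosine similarity) between two embedding vectors.
/// Returns None when the lengths differ or either vector has zero norm.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// True when the candidate diverges from the validated version beyond the threshold.
/// Incomparable vectors count as drift, a conservative assumption.
fn drift_detected(validated: &[f32], candidate: &[f32], threshold: f32) -> bool {
    cosine_distance(validated, candidate).map_or(true, |distance| distance > threshold)
}
```

A caller would emit the drift_detected event whenever this returns true and hold the write for review.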
VaultScope: Multi-Vault and Multi-Agent Addressing
Introduces one addressing type that targets any locus across multiple vaults and agents unambiguously, usable from any background job without per-job workarounds.
VaultScope encodes the vault identifier, the agent identifier, and the locus path as a single composable value, eliminating ambiguity when multiple vaults share a worker.
Every background job (distillation, purge, audit, migration, and others) carries a VaultScope, making cross-vault operations a first-class primitive rather than a per-job workaround.
All existing jobs are migrated to use VaultScope in a single coordinated change — no incremental per-job migration is needed.
For: Developers building multi-agent systems where several agents share or exchange knowledge across isolated vaults, and need deterministic addressing for background jobs.
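A composable address value of this kind could be modeled as below. Only the three components (vault identifier, agent identifier, locus path) come from the description above. The field names, the `with_locus` helper, and the `vault:agent:locus` string form are hypothetical.

```rust
use std::fmt;

/// Sketch of an address combining vault, agent, and locus (field names assumed).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct VaultScope {
    vault_id: String,
    agent_id: String,
    locus: String,
}

impl VaultScope {
    fn new(vault_id: &str, agent_id: &str, locus: &str) -> Self {
        VaultScope {
            vault_id: vault_id.to_string(),
            agent_id: agent_id.to_string(),
            locus: locus.to_string(),
        }
    }

    /// Derive a scope for another locus in the same vault and agent.
    fn with_locus(&self, locus: &str) -> Self {
        VaultScope { locus: locus.to_string(), ..self.clone() }
    }
}

impl fmt::Display for VaultScope {
    // The colon-separated rendering is an assumption made for this sketch.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}:{}", self.vault_id, self.agent_id, self.locus)
    }
}
```

Deriving `Eq` and `Hash` lets a worker that serves several vaults use the scope directly as a map key.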
Vault Lifecycle Management: State Machine, Retention and History Pruning
Adds an explicit note lifecycle state machine (Draft → PendingReview → Live → Deprecated → Garbage) and declarative retention rules — keeping the vault compact and high-quality automatically.
Each note carries a lifecycle_state field; transitions are explicit API calls with optional guard conditions so no note skips a required validation step. Draft notes are excluded from search by default; Deprecated notes are downweighted.
Operators define [[vault.lifecycle]] rules in TOML: conditions such as age, decay score, or locus pattern trigger a Job::Purge(Lifecycle). The purge job runs after distillation so no note is deleted before its value has been extracted.
Configurable history pruning caps per-note version history with max_versions and a TTL, preventing unbounded growth of the .history/ directories.
For: Operators who want the vault to self-regulate quality over time, and teams building multi-agent pipelines that need a formal quality gate between note production and retrieval.
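The lifecycle state machine above can be sketched as an enum with an explicit transition check. The state names come from the description; the rule that only single forward steps are legal, the method names, and the error type are assumptions, and guard conditions are omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LifecycleState {
    Draft,
    PendingReview,
    Live,
    Deprecated,
    Garbage,
}

impl LifecycleState {
    /// The single forward step allowed from each state, if any (assumed rule).
    fn next(self) -> Option<LifecycleState> {
        use LifecycleState::*;
        match self {
            Draft => Some(PendingReview),
            PendingReview => Some(Live),
            Live => Some(Deprecated),
            Deprecated => Some(Garbage),
            Garbage => None,
        }
    }

    /// Explicit transition: rejects any target that is not the next state.
    fn transition(self, target: LifecycleState) -> Result<LifecycleState, String> {
        if self.next() == Some(target) {
            Ok(target)
        } else {
            Err(format!("illegal transition {:?} -> {:?}", self, target))
        }
    }

    /// Draft notes are excluded from search by default.
    fn excluded_from_default_search(self) -> bool {
        self == LifecycleState::Draft
    }
}
```

Rejecting a jump from Draft straight to Live is what gives PendingReview its role as a quality gate.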
Semantic Forget: Intentional Scoped Deletion
Lets operators explicitly remove a topic or locus from the vault — with a mandatory dry-run preview, double confirmation, and progressive decay instead of immediate deletion.
vault_forget(scope, dry_run: true) returns the full list of affected notes and any derived skills before any state change — the operator reviews and confirms explicitly.
On confirmed deletion, notes are marked forgotten=true and decay accelerates over a configurable window, removing them from search results progressively rather than immediately.
Cascade behavior is configurable: forgetting a knowledge/ topic can optionally propagate to linked skills/ and peers/ entries derived from it, with each cascade step listed in the dry-run preview.
For: Teams or individuals removing a project, topic, or person from the vault intentionally, with full visibility into what will be affected before committing and a decay window during which the deletion can still be undone.
Temporal Index Foundation: Chronological Memory Queries
Lays the foundation for time-aware vault queries — a chronological index from note frontmatter lets agents ask what happened before, after, or around a date without a calendar or graph database.
A TemporalIndex is derived at write time from frontmatter fields (occurred_at, valid_from, event-date, created) — no LLM extraction, no separate store. This release ships the index and the vault_timeline API surface; higher-level temporal reasoning ships in v0.5.0.
The vault_timeline tool exposes before/after/around/upcoming queries; the index is fully reconstructible via a ReIndex job if frontmatter changes.
Job::Validate cross-checks temporal contradictions between notes (e.g., two notes asserting conflicting event orders) as part of the memory validation pipeline.
For: Agents and developers who need to reconstruct decision timelines, detect sequencing contradictions, or surface upcoming-deadline notes — without adding a calendar or graph infrastructure.
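The before/after/around queries map naturally onto range scans over an ordered map. The sketch below assumes timestamps are stored as Unix seconds and uses hypothetical method names; it is not the actual TemporalIndex or vault_timeline API.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Chronological index: timestamp (assumed Unix seconds) to note ids.
#[derive(Default)]
struct TemporalIndex {
    by_time: BTreeMap<i64, Vec<String>>,
}

impl TemporalIndex {
    fn insert(&mut self, timestamp: i64, note_id: &str) {
        self.by_time.entry(timestamp).or_default().push(note_id.to_string());
    }

    /// Notes strictly before the timestamp, oldest first.
    fn before(&self, timestamp: i64) -> Vec<&str> {
        self.by_time
            .range(..timestamp)
            .flat_map(|(_, ids)| ids.iter().map(String::as_str))
            .collect()
    }

    /// Notes strictly after the timestamp, oldest first.
    fn after(&self, timestamp: i64) -> Vec<&str> {
        self.by_time
            .range((Excluded(timestamp), Unbounded))
            .flat_map(|(_, ids)| ids.iter().map(String::as_str))
            .collect()
    }

    /// Notes within `window` seconds on either side of the timestamp.
    fn around(&self, timestamp: i64, window: i64) -> Vec<&str> {
        self.by_time
            .range(timestamp - window..=timestamp + window)
            .flat_map(|(_, ids)| ids.iter().map(String::as_str))
            .collect()
    }
}
```

Because the index holds only values derived from frontmatter, it can be dropped and rebuilt from the notes at any time, which is what makes a ReIndex job sufficient for recovery.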
Distill: Scheduled Knowledge Compression
Automatically compresses accumulated raw notes into compact, reusable knowledge — running as scheduled background jobs while the vault is idle.
Four distillation modes run as Job::Distill: Semantic (synthesizes topic clusters into a single knowledge note), Learn (extracts cost and quality patterns once enough LLM interactions, recorded as QaEvents, are available), Peer (requires ≥5 sessions; builds a user behavior profile), and Rationale (preserves the reasoning chain behind decisions).
Distilled notes inherit a trust score computed from their source notes, and a Note History fingerprint is stored so drift from the validated version can be detected later.
DistillSource supports multi-vault targeting via VaultScope, allowing a distillation job to draw from notes across isolated vaults.
For: Developers running long-lived agent sessions who want raw notes compressed into searchable knowledge automatically — and teams building shared knowledge stores that grow in quality over time.
Lessons Recall: Dedicated Endpoint, MCP Tool, and Hook
Surfaces distilled lessons-learned notes on demand via a dedicated recall endpoint, a native MCP tool, and an agent hook — making accumulated lessons actionable at decision time.
A dedicated GET /api/v1/lessons/recall endpoint queries the lessons-learned corpus with semantic search and returns ranked results with source attribution.
A vault_lessons_recall MCP tool exposes the same surface directly to MCP clients, with optional role and tag filters so agents retrieve only domain-relevant lessons.
A pre-action hook fires automatically when the agent is about to start a new task, injecting the top-3 matching lessons into the context before the first response.
For: Developers and agents who want past mistakes and validated patterns surfaced automatically before acting — not just stored somewhere and manually searched.
Multimodal Gateway: OpenAI Content-Array and Vision Routing
Extends the gateway to accept the OpenAI content-array message format and route vision requests to an appropriate model — enabling multimodal inputs without changing the vault API.
The gateway parses ChatMessage::User as either a plain string or a Vec<ContentPart> (text + image_url), matching the OpenAI chat completions schema, so existing text-only clients are unaffected.
A vision routing gate inspects the content array at request time: messages containing image parts are forwarded to a configured vision-capable endpoint; text-only messages follow the standard routing path.
Configuration exposes a vision_endpoint field in the gateway TOML; if unset, image-bearing requests return a 422 with an explicit error rather than silently stripping the image.
For: Developers building agents that process screenshots, diagrams, or documents alongside text, and operators who want multimodal inputs handled at the gateway layer without routing logic in each client.
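The vision routing gate can be sketched as a pure function over the parsed message content. The type and variant names below (`UserContent`, `Route`, and so on) are simplified stand-ins chosen for this example, not the gateway's actual types, and deserialization of the OpenAI schema is left out.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ContentPart {
    Text(String),
    ImageUrl(String),
}

/// A user message is either a plain string or a list of content parts.
#[derive(Debug, Clone, PartialEq)]
enum UserContent {
    Plain(String),
    Parts(Vec<ContentPart>),
}

#[derive(Debug, PartialEq)]
enum Route {
    Standard,
    Vision(String),
    Rejected { status: u16, reason: String },
}

/// Route by content: image parts go to the vision endpoint when one is
/// configured, and are rejected with 422 otherwise.
fn route_user_message(content: &UserContent, vision_endpoint: Option<&str>) -> Route {
    let has_image = match content {
        UserContent::Plain(_) => false,
        UserContent::Parts(parts) => parts
            .iter()
            .any(|part| matches!(part, ContentPart::ImageUrl(_))),
    };
    match (has_image, vision_endpoint) {
        (false, _) => Route::Standard,
        (true, Some(endpoint)) => Route::Vision(endpoint.to_string()),
        (true, None) => Route::Rejected {
            status: 422,
            reason: "vision_endpoint is not configured".to_string(),
        },
    }
}
```

Returning an explicit rejection instead of dropping the image keeps the failure visible to the client, in line with the behavior described above.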
libsql Remote: Replicated SQLite Backend
Adds an opt-in remote SQLite backend powered by libsql, enabling vault replication and read replicas without changing the DocumentStore interface.
The gradatum-db-sqlite crate exposes a libsql feature flag; enabling it replaces the local SQLite connection with a remote libsql endpoint.
The DocumentStore, IndexStore, and QueueStore traits remain unchanged — the swap is purely a configuration-level choice with no application code changes.
An internal poll loop on the libsql queue backend (500 ms interval) bridges the remote queue to the existing channel-based worker dispatch.
For: Operators who need vault data replicated across machines or want read replicas for high-read workloads, without migrating to a heavier database engine.
LanceDB Vector Backend: Scalable Embedding Store
Replaces the default SQLite vector store with LanceDB for workloads where Approximate Nearest Neighbor (ANN) search on large embedding sets becomes a bottleneck.
The gradatum-db-lancedb crate implements the VectorStore trait backed by LanceDB, an embedded columnar vector database.
Switching is opt-in via configuration — IndexStore (full-text) stays on SQLite; only the vector path moves to LanceDB, keeping the deployment simple.
A Parquet-backed DocStore variant (planned for a later phase) will extend LanceDB to cover document storage as well for very large vaults.
For: Developers with vaults exceeding tens of thousands of notes who find SQLite ANN performance insufficient, and contributors who want to benchmark retrieval quality across storage backends.
gradatum-studio: Vault Management Interface
A local web interface for reading, searching, reviewing, and monitoring the vault — without exposing any API key or modifying the vault write path.
Five surfaces ship in the MVP: a dashboard (live vault metrics and recent activity), a note browser with inline markdown rendering, a search panel with score and section filters, a review queue (notes pending lifecycle validation), and a jobs monitor showing background task status.
Authentication is handled via a single API key injected at startup; no OAuth or user database is required for the single-user deployment.
A WHY scores panel surfaces the curator confidence and section classification for each note — making the reasoning behind search ranking visible and auditable.
The interface is read-only for vault content by design; writes still go through the standard API so the vault write path remains the sole source of truth.
For: Developers and operators who want to inspect and monitor their vault through a browser rather than raw API calls, without adding infrastructure or weakening data sovereignty.
gradatum-mcp: Native Model Context Protocol Server
Exposes the full vault API as a native MCP (Model Context Protocol) server — usable from any MCP-compatible client without an intermediary stub.
The gradatum-mcp crate publishes vault tools (vault_write, vault_search, vault_forget, vault_timeline, and others) as MCP capabilities with full JSON Schema definitions.
Authentication is handled MCP-side, decoupled from the HTTP API auth layer, so the MCP surface has its own access control.
The stdio transport (for local clients) and the Streamable HTTP transport (F-56, for remote clients) are both supported from the same crate.
For: Developers using MCP-compatible LLM clients (Claude Desktop, Cursor, custom agents) who want direct vault access without installing a local proxy stub.
Streamable HTTP Transport: Load-Balancer-Friendly MCP
Implements the MCP 2025-11-25 Streamable HTTP transport — a single /mcp endpoint that works with load balancers, serverless runtimes, and mobile clients.
A single POST+GET /mcp endpoint handles all MCP traffic; responses are either plain JSON or upgrade to Server-Sent Events (SSE) per-request, without maintaining a persistent connection.
This replaces the deprecated HTTP+SSE transport (spec 2024-11-05), which required a persistent SSE connection incompatible with most load balancers.
The local stdio transport is preserved for desktop clients; optional backward-compatible SSE mode allows a smooth migration for existing integrations.
For: Operators deploying gradatum behind a reverse proxy or in a containerized environment, and mobile MCP clients (Claude for iOS/Android) that require a stateless HTTP transport.
Memory Validation: Self-Healing Before Storage
Intercepts distilled notes before they enter long-term memory, corrects detectable errors automatically, and discards notes that cannot be repaired.
A background validation job computes a composite quality score; notes above a configurable threshold are accepted, while notes with specific error patterns are routed to a repair strategy.
Three repair strategies: contradiction patch (corrects numeric contradictions against source notes), entity scrub (removes hallucinated entity claims), and grounding rewrite (reconstructs under-anchored text from source material). Internally these are named ContradictionPatch, EntityScrub, and GroundingRewrite.
Repaired notes are stored with an audit flag (HEALED_ACCEPT) and a change log; notes that cannot be repaired are discarded cleanly — never silently stored.
For: Teams where distillation quality is critical — RAG pipelines, shared knowledge bases, long-running agents — who cannot afford hallucinated or contradictory notes accumulating in the vault.
Multi-User Support: Per-User Vault Isolation
Enables multiple users to share a single gradatum deployment with configurable isolation — private identity, shared or private knowledge, role-based access.
Each user is represented by a UserRecord with a Bearer JWT; an admin invitation flow provisions new users without exposing the root credential.
Isolation is per-locus type: identity/ and peers/ are always private; knowledge/ and skills/ are configurable as shared or private per deployment.
ACL policies in gradatum-acl-policy enforce locus-level access at the storage layer, so isolation is structural rather than enforced only at the API boundary.
For: Teams and households who want to run one gradatum instance shared across multiple people or agents, each with their own private memory and optionally contributing to shared knowledge.
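The per-locus isolation rule can be expressed as a small lookup. This is a sketch of the rule as stated above (identity/ and peers/ always private, knowledge/ and skills/ configurable); the `Visibility` and `IsolationConfig` types and the function name are hypothetical and do not reflect the gradatum-acl-policy API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Visibility {
    Private,
    Shared,
}

/// Deployment-level settings for the two configurable loci (hypothetical type).
struct IsolationConfig {
    knowledge: Visibility,
    skills: Visibility,
}

/// Resolve the visibility of a locus path such as "knowledge/rust/traits".
/// Returns None for locus types this sketch does not know about.
fn locus_visibility(locus: &str, config: &IsolationConfig) -> Option<Visibility> {
    match locus.split('/').next().unwrap_or("") {
        // Always private, regardless of configuration.
        "identity" | "peers" => Some(Visibility::Private),
        "knowledge" => Some(config.knowledge),
        "skills" => Some(config.skills),
        _ => None,
    }
}
```

Evaluating this rule at the storage layer, rather than only in API handlers, is what keeps a misrouted request from reading another user's identity/ notes.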
OAuth MCP: Remote Access for Mobile and ChatGPT
Enables gradatum to be reached from mobile MCP clients and ChatGPT without weakening sovereignty — using a self-hosted OAuth 2.1 authorization server.
Gradatum acts as an OAuth 2.1 resource server: it validates tokens and publishes Protected Resource Metadata (RFC 9728) but delegates token issuance to a self-hosted identity provider using OIDC (OpenID Connect) — such as Kanidm.
The IdentityProvider trait decouples the identity provider from gradatum-auth (D-14), so the IdP is replaceable without modifying the authorization layer.
PKCE (Proof Key for Code Exchange) with the S256 method, Dynamic Client Registration, and explicit consent flows are required. Static bearer tokens are not accepted, which matches what Claude for mobile and ChatGPT require.
For: Operators who want to reach their vault from a mobile MCP client or ChatGPT without a VPN, and who want token rotation, explicit consent, and centralized revocation instead of static bearer tokens.
Vault Audit and Deduplication: Scheduled Quality Pass
Runs a scheduled audit pass over the vault to detect duplicate notes, score the vault's overall knowledge quality, and produce a conflict report.
Job::Audit(AuditMode) supports three modes: Detect (identifies duplicates and near-duplicates by semantic similarity), Deduplicate (merges or flags them), and Both (full pass).
The audit produces a per-locus vault score reflecting coverage, freshness, and uniqueness — visible in the jobs introspection API.
Conflict reports list notes with contradictory claims so operators or distillation jobs can resolve them explicitly rather than leaving ambiguity in the search index.
For: Operators maintaining long-lived vaults where notes accumulate from multiple agents or ingestion pipelines, and who need a systematic quality baseline rather than ad-hoc manual review.