Pick your deployment profile.
Solo is the recommended default. All modes share the same vault and MCP stub — they differ in services started, LLM provider, and access pattern.
Three archives: download only what your deployment needs. All target Linux x86_64, and each ships with a SHA256 checksum and a SLSA provenance attestation.
Contains gradatum-server, gradatum-worker, gradatum-admin.
Core vault store, job queue, admin CLI.
gradatum-server-v0.5.2-x86_64-unknown-linux-gnu.tar.gz
Contains gradatum-gateway and gradatum-engine.
Router, circuit-breaker, llama-server supervisor. Required for local inference.
gradatum-llm-v0.5.2-x86_64-unknown-linux-gnu.tar.gz
Contains gradatum-mcp-stub.
MCP bridge for agent clients (Claude Code, Claude Desktop, etc.).
gradatum-mcp-v0.5.2-x86_64-unknown-linux-gnu.tar.gz
# verify SHA256 (example: server archive)
sha256sum -c SHA256SUMS --ignore-missing
# verify SLSA build provenance (GitHub native)
gh attestation verify \
gradatum-server-v0.5.2-x86_64-unknown-linux-gnu.tar.gz \
--repo gradatum/gradatum
Source is open on GitHub (Apache-2.0). The crates.io name is reserved; the full embeddable Rust library ships at v1.0, when the API is stable.
crates.io/crates/gradatum · name reserved · full library at v1.0 · Apache-2.0
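The checksum and provenance commands shown earlier cover the server archive only. A minimal sketch of how the same two checks might be applied to any subset of the three archives follows. The helper names `archive_name` and `verify_archives` are hypothetical, and the sketch assumes `SHA256SUMS` and the downloaded archives sit in the current directory and that the GitHub CLI is installed and authenticated.

```shell
# Release version and target triple taken from the archive names above.
VERSION="0.5.2"
TARGET="x86_64-unknown-linux-gnu"

# Build the archive file name for one component: server, llm, or mcp.
archive_name() {
  printf 'gradatum-%s-v%s-%s.tar.gz\n' "$1" "$VERSION" "$TARGET"
}

# Hypothetical helper: verify checksums, then provenance, for each component.
verify_archives() {
  # Checks every archive present locally; entries for absent files are skipped.
  sha256sum -c SHA256SUMS --ignore-missing || return 1
  for va_component in "$@"; do
    gh attestation verify "$(archive_name "$va_component")" \
      --repo gradatum/gradatum || return 1
  done
}

# Example (not run here): verify_archives server llm mcp
```

Keeping the file-name construction in one function means a version bump only touches `VERSION`.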
Clone the repository and compile all workspace crates. Requires stable Rust 1.88 or newer (the MSRV).
git clone https://github.com/gradatum/gradatum
cd gradatum && cargo build --release --workspace
github.com/gradatum/gradatum · Apache-2.0 · Rust 2021 · MSRV 1.88+
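Before building from source, it may help to confirm that the installed toolchain meets the stated MSRV. The sketch below is one way to do that. `version_ge` and `installed_rust` are hypothetical helper names, `1.88.0` is an assumed reading of "MSRV 1.88+", and the comparison relies on GNU `sort -V`, which is available on typical Linux hosts.

```shell
# Assumed concrete form of the "MSRV 1.88+" stated above.
MSRV="1.88.0"

# Succeeds when version $1 is greater than or equal to version $2.
# Uses GNU sort's version ordering: the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extract the toolchain version, e.g. "rustc 1.88.0 (...)" becomes "1.88.0".
installed_rust() {
  rustc --version | awk '{ print $2 }'
}

# Example (not run here):
#   version_ge "$(installed_rust)" "$MSRV" || echo "Rust $MSRV or newer is required" >&2
```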
v0.5.2 · Linux x86_64 · API not stable before v1.0. See docs/DEPLOYMENT.md for the full deployment reference.
| Provider | Flag | Requirement |
|---|---|---|
| Ollama (local) | --provider ollama | Ollama installed and running |
| llama.cpp server | --provider llamacpp --llm-url URL | llama.cpp server accessible |
| OpenRouter | --provider openrouter --api-key KEY | OpenRouter API key |
| Anthropic | --provider anthropic --api-key KEY | Anthropic API key |
| OpenAI | --provider openai --api-key KEY | OpenAI API key |
| Custom OpenAI-compatible | --provider custom --llm-url URL | Accessible endpoint |
| None (Nano mode) | --level nano | — |
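The table can be read as a lookup from provider name to flags. The sketch below encodes it that way for use in wrapper scripts. It only prints the flags listed in the table; which Gradatum command accepts them is not stated here, so no binary is invoked. `provider_flags` is a hypothetical helper, and `nano` is an assumed key for the "None (Nano mode)" row.

```shell
# Print the flags the table lists for a provider.
# $1 = provider key, $2 = URL or API key where the table requires one.
provider_flags() {
  case "$1" in
    ollama)
      printf '%s\n' "--provider ollama" ;;
    llamacpp|custom)
      printf '%s\n' "--provider $1 --llm-url ${2:-}" ;;
    openrouter|anthropic|openai)
      printf '%s\n' "--provider $1 --api-key ${2:-}" ;;
    nano)
      # Assumed key for the "None (Nano mode)" row.
      printf '%s\n' "--level nano" ;;
    *)
      echo "unknown provider: $1" >&2
      return 1 ;;
  esac
}

# Example: provider_flags llamacpp http://127.0.0.1:8080
```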
gradatum-admin init --preset hierarchical --root /var/lib/gradatum
gradatum-admin api-key create --root /var/lib/gradatum --owner mcp-stub
gradatum-admin api-key list --root /var/lib/gradatum
gradatum-admin jobs list --root /var/lib/gradatum
gradatum-admin jobs dlq --root /var/lib/gradatum
gradatum-admin vault rename "old title" "new title" # F-39 stable wikilinks
Single-binary on one box, or scale out: one GPU host serving several models, an app host routing through the gateway with automatic CPU fallback.
consumers (apps · agents · MCP clients)
↓ MCP / HTTP / REST
┌──────────────────── app-host (Linux) ───────────────────────┐
│  gradatum-server ─┐                                         │
│  gradatum-worker ─┴──────────▶ gradatum-gateway :8436       │
│                                (router · circuit-breaker)   │
│                                  │ primary   │ fallback     │
│                                  ▼           ▼              │
│                              [GPU-HOST]  local CPU fallback │
└──────────────────────────────────┼──────────────────────────┘
                                  LAN
┌──────────────────── gpu-host (Linux) ───────────────────────┐
│  gradatum-engine · one supervisor binary, one instance/model│
│  chat :8083 · embed :8432 · small :8082                     │
│  reason :8081 · vision :8080 (+mmproj)                      │
│  each instance supervises one llama-server child (loopback) │
│  GGUF bind-mounted ro · /opt/gradatum/models/               │
└─────────────────────────────────────────────────────────────┘
primary → GPU host · fallback → local CPU (circuit-breaker auto)
Full deployment guide and configuration reference: docs/DEPLOYMENT.md.
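The primary/fallback routing in the diagram can be illustrated with a much simpler stand-in. The sketch below is not the gateway's implementation: a real circuit breaker tracks failures over time, while this does a single probe per call. `role_port`, `pick_backend`, and `probe_http` are hypothetical helpers, the host name `gpu-host` and the local fallback URL are placeholders, and only the port numbers come from the diagram.

```shell
# Ports the diagram lists for each model role on the GPU host.
role_port() {
  case "$1" in
    vision) echo 8080 ;;
    reason) echo 8081 ;;
    small)  echo 8082 ;;
    chat)   echo 8083 ;;
    embed)  echo 8432 ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Print the primary URL if the probe command succeeds for it, else the fallback.
# $1 = probe command taking a URL, $2 = primary URL, $3 = fallback URL.
pick_backend() {
  pb_probe="$1"; pb_primary="$2"; pb_fallback="$3"
  if "$pb_probe" "$pb_primary" >/dev/null 2>&1; then
    printf '%s\n' "$pb_primary"
  else
    printf '%s\n' "$pb_fallback"
  fi
}

# One possible probe. No health endpoint is documented here, so it hits the base URL.
probe_http() {
  curl --silent --fail --max-time 2 "$1"
}

# Example (not run here; both URLs are placeholders):
#   pick_backend probe_http "http://gpu-host:$(role_port chat)" "http://127.0.0.1:$(role_port chat)"
```

Passing the probe as an argument keeps the selection logic testable without a network.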