Roadmap
Complete
Storage & Data Model
- `Entry` data model with structured metadata and type-specific extensions
- `DistilleryStore` protocol: async storage abstraction enabling backend migration
- DuckDB backend with VSS extension and HNSW index (cosine similarity)
- Configurable embedding providers (Jina v3 default, OpenAI adapter)
- Embedding model lock via `_meta` table: prevents mixed-model corruption
- MCP server with 16 tools over stdio and streamable-HTTP (consolidated from 20 in `staging/api-hardening`)
- `distillery.yaml` config system with validation
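
The embedding model lock can be pictured with a small sketch. It is illustrative only: it uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the `_meta` column layout, the `embedding_model` key, the function name, and the exception class are all assumptions rather than the project's actual schema or API.

```python
import sqlite3


class EmbeddingModelMismatch(RuntimeError):
    """Raised when the configured model differs from the one the store was built with."""


def ensure_embedding_model(conn: sqlite3.Connection, model_name: str) -> None:
    """Record the embedding model on first use and refuse any other model afterwards.

    Hypothetical helper: table layout and key name are assumptions for illustration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    row = conn.execute(
        "SELECT value FROM _meta WHERE key = 'embedding_model'"
    ).fetchone()
    if row is None:
        # First run: lock the store to this model.
        conn.execute(
            "INSERT INTO _meta (key, value) VALUES ('embedding_model', ?)",
            (model_name,),
        )
        conn.commit()
        return
    if row[0] != model_name:
        # Vectors from different models are not comparable, so fail fast.
        raise EmbeddingModelMismatch(
            f"store is locked to {row[0]!r}, but {model_name!r} was requested"
        )
```

Calling the helper twice with the same model name is a no-op; calling it with a different name raises before any new vectors could be written alongside the old ones.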
Core Skills
- `/distill`: session knowledge capture with duplicate detection
- `/recall`: semantic search with provenance display
- `/pour`: multi-pass retrieval + structured synthesis with citations
- `/bookmark`: URL fetch, auto-summarize, store with dedup check
- `/minutes`: meeting notes with `--update` (append) and `--list` modes
- Shared `CONVENTIONS.md`: author/project identification, error handling patterns
Classification Pipeline
- `ClassificationEngine`: LLM prompt-based type assignment with confidence scoring
- `DeduplicationChecker`: skip/merge/link/create at configurable thresholds
- `/classify` skill: classify by ID, batch inbox, review queue triage
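
The skip/merge/link/create decision can be read as a ladder of similarity cutoffs. The sketch below is a minimal illustration of that idea; the class name, function name, and the three default threshold values are assumptions chosen for the example, not the values `DeduplicationChecker` ships with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupThresholds:
    # Hypothetical defaults; the real values are configurable and may differ.
    skip: float = 0.95   # near-identical: do not store again
    merge: float = 0.80  # strong overlap: fold into the existing entry
    link: float = 0.60   # related: store and link to the neighbour


def dedup_action(similarity: float, t: DedupThresholds = DedupThresholds()) -> str:
    """Map the best-match cosine similarity to one of four actions."""
    if similarity >= t.skip:
        return "skip"
    if similarity >= t.merge:
        return "merge"
    if similarity >= t.link:
        return "link"
    return "create"
```

With these assumed defaults, a best match at 0.85 would be merged, while one at 0.30 would produce a new, unlinked entry.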
Quality & Observability
- Implicit retrieval feedback + quality metrics (folded into `distillery_status`)
- Stale entry detection: `distillery_list(stale_days=…)` (formerly the `distillery_stale` tool)
- Conflict detection: `distillery_find_similar(conflict_check=true)`
- Usage metrics dashboard: `distillery_status` tool (replaces former `distillery_metrics`)
Infrastructure
- FastMCP 2.x/3.x with `@server.tool` decorators
- Hierarchical tag namespace with validation; tag inventory via `distillery_list(group_by="tag")` (formerly `distillery_tag_tree`)
- 12 entry types including `person`, `project`, `digest`, `github`, `feed`
- Entry-type schema discovery via MCP resource `distillery://schemas/entry-types` (replaces `distillery_type_schemas`)
Team Access
- HTTP transport: `distillery-mcp --transport http`
- GitHub OAuth: team authentication via FastMCP `GitHubProvider`
- Prefect Horizon deployment (MotherDuck)
- Fly.io deployment with persistent DuckDB on volume
- Namespace taxonomy: hierarchical, validated tag system
Ambient Intelligence
- `/radar`: interest-driven feed digest with AI source suggestions
- `/watch`: add/remove/list monitored feed sources
- `/tune`: adjust relevance thresholds and trust weights
- Feed polling architecture: `FeedPoller` with configurable intervals
- Source adapters: GitHub events (REST API) and RSS/Atom
- Relevance scoring pipeline: embedding-based cosine similarity
- Interest extractor: mines entries for tags, domains, repos, expertise
- Auto-tagging: source tags (`source/github/owner/repo`, `source/reddit/sub`) and topic tags from KB vocabulary
- `distillery retag` CLI: backfill tags on existing feed entries
Search
- Hybrid BM25 + vector search: DuckDB FTS extension with Reciprocal Rank Fusion (RRF)
- Recency decay: configurable time-weighted scoring (90-day window, minimum weight 0.5)
- Graceful degradation: falls back to vector-only if FTS extension unavailable
Team Skills
- `/digest`: team activity summaries over configurable time windows
- `/gh-sync`: sync GitHub issues/PRs into the knowledge base as searchable entries
- `/investigate`: deep context builder with 4-phase retrieval and relationship traversal
- `/briefing`: knowledge dashboard with solo mode (5 sections) and team mode (8 sections)
Entry Relations & Corrections
- `entry_relations` table with backfill migration
- `distillery_correct` tool for structured corrections
- `distillery_relations` tool for managing entry links
New Entry Fields
- `expires_at`: time-limited entries with UTC normalization
- `verification`: orthogonal quality tracking (Unverified, Testing, Verified)
- `session_id`: first-class field for session-scoped entries
- Extended `EntrySource`: added inference, documentation, external provenance values
Session Hooks
- Hook dispatcher script (`distillery-hooks.sh`): routes UserPromptSubmit, SessionStart, PreCompact
- Memory nudge: periodic reminder to `/distill` every 30 prompts
- SessionStart briefing: automatic context injection via HTTP MCP
- Scope-aware `/setup` hook configuration: detects plugin install scope (user/project)
Onboarding
- `/setup` skill: MCP connectivity wizard, auto-poll configuration, session hook setup
- uvx-first setup: `uvx distillery-mcp` as recommended first-time path
API Hardening (staging/api-hardening → released)
- API consolidation: 20 → 16 tools. Removed `distillery_aggregate`, `distillery_stale`, `distillery_tag_tree`, `distillery_metrics`, `distillery_interests`, `distillery_type_schemas`, `distillery_poll`, `distillery_rescore`. Functionality folded into `distillery_list`, `distillery_status`, `distillery_configure`, REST `/api/maintenance`, and the `distillery://schemas/entry-types` resource.
- #244: Bulk ingest pipeline with a new `distillery_store_batch` tool; `/gh-sync` runs as a server-side background job tracked by `distillery_sync_status`
- #245: Hardened MCP interface (tool descriptions, structured error codes, schema validation, INVALID_PARAMS suggestions)
- #232: `distillery_store` enum includes `github`
- #238: `distillery_store` accepts `output_mode="summary"` to skip dedup/conflict checks
- #241: label→tag sanitiser handles underscored labels
- #240: `/gh-sync` passes valid `output_mode`
- #317: `distillery_list`/`distillery_search` exclude archived entries by default
- #311: `distillery_list` default `output_mode="summary"`
- #346, #347, #349: DuckDB WAL recovery and FTS replay hardening
- #351: Embedding budget default raised to unlimited; provider 429s surface to caller
Planned
P0: Follow-up
- #112: Security Review Follow-Up
P0: Quality & Bugfixing
PRs go directly to main.
- #230: DuckDB WAL corruption on unclean shutdown
- #236: RateLimitMiddleware defaults starve local-client bursts
- #221: FeedPoller poll cycle exceeds 5 minutes
- #169: `distillery retag` produces no output
- #235: Plugin auto-registers hosted demo MCP
P0: Memory Benchmarking
- #233: LongMemEval retrieval benchmark
P1: Near-term Features
- #199: `distillery_extract` for PreCompact summarisation
- #237: Retrieval-hygiene conventions docs
- #212: Slim down container image
- #163: Relevance-sorted feed queries for /radar
- #152: `/whois` skill
- #151: `/process` skill
- #149: Access control (visibility flag)
Deferred
- #147, #142, #141, #140, #138, #158: Graph analysis arc (NetworkX, hidden connections, epiphany generation)
- #167: Slack conversation adapter
- #101: Browser extension
- #93: Public knowledge spaces for OSS projects
- #81: Tauri desktop frontend
- LangGraph evaluation for complex skill orchestration
- Multi-team support and cross-team knowledge sharing
- Re-embedding migration tooling
Technology Stack
| Layer | Current | Planned |
|---|---|---|
| Interface | Claude Code skills | Same |
| Transport | stdio + streamable-HTTP | Same |
| Auth | GitHub OAuth (FastMCP) | + multi-team RBAC |
| Storage | DuckDB + VSS + FTS / MotherDuck | Same |
| Search | Hybrid BM25 + vector (RRF) | + score normalization |
| Embeddings | Jina v3 / OpenAI | Same |
| Language | Python 3.11+ | Same |
| Hosting | Local / Fly.io / Prefect Horizon | Same |