Architecture

Distillery is built as a 4-layer system where skills (SKILL.md files) drive all user interaction, the MCP server mediates all storage access, and backends are swappable through typed Protocol interfaces.

[Figure: architecture diagram. Panels: 14 Claude Code Skills (/distill, /recall, /pour, /bookmark, /minutes, /classify, /watch, /radar, /tune, /setup, /digest, /gh-sync, /investigate, /briefing); MCP Server (FastMCP 2.x/3.x, stdio + streamable-HTTP, tools, REST webhooks under /api/*; GitHub OAuth via OrgRestrictedGitHubProvider; middleware, budget, rate limits); Core Protocols (DistilleryStore, EmbeddingProvider; typed async Protocol interfaces); Feed System (GitHub, RSS/Atom, auto-tagging; poller, scorer, interests); DuckDB + VSS + FTS (HNSW + BM25 hybrid with RRF; vector + keyword search); Embedding (Jina v3 / OpenAI; configurable provider); Classification (LLM engine, dedup, conflicts, tag validation); Config (distillery.yaml; security, validation); 12 Entry Types (session, bookmark, minutes, meeting, reference, idea, inbox, person, project, digest, github, feed); Dedup Thresholds (skip >= 0.95, merge >= 0.80, link >= 0.60, unique < 0.60); Hierarchical Tags (project/distillery/sessions, domain/storage, source/bookmark/duckdb-org, team/distillery).]

Layers

| Layer | What it does | Key files |
| --- | --- | --- |
| Skills | 14 SKILL.md files — portable, version-controlled slash commands. Not Python code. | skills/*/SKILL.md |
| MCP Server | Tools exposed over stdio (local) or streamable-HTTP (team). Built on FastMCP 2.x/3.x with @server.tool decorators. | src/distillery/mcp/server.py |
| Webhook API | REST endpoints mounted under /api/* for orchestrated operations. The active endpoint is POST /api/maintenance (full poll → rescore → classify-batch pipeline). Individual scheduling endpoints — POST /api/poll, POST /api/rescore, POST /api/classify-batch (also reachable at POST /api/hooks/poll, /api/hooks/rescore, /api/hooks/classify-batch during the deprecation window) — are deprecated in favour of Claude Code routines and the orchestrated maintenance endpoint. Bearer token auth, per-endpoint cooldowns persisted to DuckDB. Mounted alongside MCP in HTTP mode. | src/distillery/mcp/webhooks.py |
| Auth | MCP: GitHub OAuth with org-restricted access. Webhooks: bearer token via DISTILLERY_WEBHOOK_SECRET. Middleware handles logging, rate limiting, security headers, budget tracking. | src/distillery/mcp/auth.py, middleware.py, budget.py |
| Core Protocols | Typed Protocol interfaces (structural subtyping, not ABCs). All storage operations are async. Includes query_audit_log for audit data access. | src/distillery/store/protocol.py, embedding/protocol.py |
| Feeds | GitHub events and RSS/Atom polling. Authenticated via GITHUB_TOKEN for private repos. Auto-tagging (source + topic tags from KB vocabulary). Relevance scoring via embeddings. Interest extraction for source suggestions. | src/distillery/feeds/ |
| Backends | DuckDB + VSS (HNSW) + FTS (BM25). Hybrid search with RRF fusion and recency decay. Jina v3 / OpenAI embeddings. LLM classification with dedup + conflict detection. | src/distillery/store/duckdb.py, embedding/, classification/ |

Key Design Decisions

Skills are SKILL.md files, not Python code. They are portable, version-controlled, and team-shareable. Claude Code loads the markdown and follows the instructions — no compilation or import required.

MCP server is the primary runtime interface. All storage access goes through the protocol, over stdio (local) or HTTP (team). Skills never access the database directly. REST webhook endpoints provide a secondary interface for automated scheduling — they share the same DuckDB store and run in the same uvicorn process.

Storage abstraction via DistilleryStore protocol. Enables future migration to Elasticsearch without rewriting skills or the MCP server.
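
As a rough illustration of why a typed Protocol makes the backend swappable, the sketch below defines a cut-down protocol and an in-memory class that satisfies it structurally, without inheriting from it. The names StoreSketch, InMemoryStore, save_and_load, and the two method signatures are hypothetical; the real DistilleryStore interface lives in src/distillery/store/protocol.py and is likely larger.

```python
import asyncio
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class StoreSketch(Protocol):
    """Hypothetical, cut-down stand-in for DistilleryStore (not the real interface)."""

    async def store(self, content: str) -> str: ...

    async def get(self, entry_id: str) -> Optional[str]: ...


class InMemoryStore:
    """Satisfies StoreSketch by shape alone; note there is no base class."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}

    async def store(self, content: str) -> str:
        entry_id = f"entry-{len(self._rows) + 1}"
        self._rows[entry_id] = content
        return entry_id

    async def get(self, entry_id: str) -> Optional[str]:
        return self._rows.get(entry_id)


async def save_and_load(backend: StoreSketch, content: str) -> Optional[str]:
    # The caller depends only on the protocol, so any conforming backend works.
    entry_id = await backend.store(content)
    return await backend.get(entry_id)


if __name__ == "__main__":
    print(asyncio.run(save_and_load(InMemoryStore(), "hello")))
```

Because the check is structural, a different backend only has to provide methods with the same names and signatures; callers written against the protocol do not change.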

Configurable embedding providers. Swap between Jina v3, OpenAI, or a zero-vector stub for testing via distillery.yaml.
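
A zero-vector stub of the kind mentioned above could look roughly like this. The class name, the embed/embed_batch method names, and the default dimension of 1024 are assumptions made for illustration; the actual EmbeddingProvider protocol is defined in src/distillery/embedding/protocol.py.

```python
import asyncio


class ZeroVectorProvider:
    """Hypothetical test stub: every input maps to an all-zero vector."""

    def __init__(self, dimensions: int = 1024) -> None:
        # 1024 is an assumed default, not a value taken from the project.
        self.dimensions = dimensions

    async def embed(self, text: str) -> list[float]:
        # No network call and no API budget consumed.
        return [0.0] * self.dimensions

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [await self.embed(text) for text in texts]


if __name__ == "__main__":
    vectors = asyncio.run(ZeroVectorProvider(4).embed_batch(["a", "b"]))
    print(len(vectors), len(vectors[0]))
```

A stub like this keeps tests deterministic and offline, at the cost of making every similarity comparison meaningless, so it only suits tests that do not assert on ranking.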

Semantic deduplication. Prevents knowledge base pollution with configurable thresholds:

| Threshold | Default | Action |
| --- | --- | --- |
| Skip | 0.95 | Near-duplicate — don't store |
| Merge | 0.80 | Similar enough to combine |
| Link | 0.60 | Related — store with cross-reference |
| Below | < 0.60 | Unique — store normally |
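
The threshold table can be read as a simple cascade from the strictest threshold down. The sketch below is one way to express it, assuming each threshold is inclusive (>=); the DedupAction enum and the dedup_action function are hypothetical names, not the DeduplicationChecker API.

```python
from enum import Enum


class DedupAction(str, Enum):
    SKIP = "skip"
    MERGE = "merge"
    LINK = "link"
    UNIQUE = "unique"


def dedup_action(
    similarity: float,
    skip: float = 0.95,
    merge: float = 0.80,
    link: float = 0.60,
) -> DedupAction:
    """Map a cosine similarity score to a dedup action, strictest threshold first."""
    if similarity >= skip:
        return DedupAction.SKIP
    if similarity >= merge:
        return DedupAction.MERGE
    if similarity >= link:
        return DedupAction.LINK
    return DedupAction.UNIQUE


if __name__ == "__main__":
    for score in (0.97, 0.85, 0.62, 0.30):
        print(score, dedup_action(score).value)
```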

Classification with confidence scoring. LLM-based type assignment with a team review queue for low-confidence results (below the configurable confidence_threshold, default: 60%).
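
One plausible reading of the review queue is that a low-confidence classification lands in the pending_review status instead of active. The sketch below illustrates that reading only; the function name and the mapping from confidence to status are assumptions, and the threshold is written as 0.60 to stand for the 60% default.

```python
def status_for_confidence(confidence: float, confidence_threshold: float = 0.60) -> str:
    """Route results below the threshold to the review queue (assumed behaviour)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return "pending_review" if confidence < confidence_threshold else "active"


if __name__ == "__main__":
    print(status_for_confidence(0.42))
    print(status_for_confidence(0.91))
```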

Core Data Model

The Entry dataclass (src/distillery/models.py) is the fundamental unit of knowledge:

| Field | Type | Description |
| --- | --- | --- |
| id | str (UUID4) | Unique identifier |
| content | str | The knowledge content |
| entry_type | EntryType | session, bookmark, minutes, meeting, reference, idea, inbox, person, project, digest, github, feed |
| source | EntrySource | claude_code, manual, import, inference, documentation, external |
| status | EntryStatus | active, pending_review, archived |
| tags | list[str] | Hierarchical tags (project/distillery/decisions) |
| metadata | dict | Type-specific fields (validated per entry type) |
| version | int | Incremented on updates |
| author | str | Who created the entry |
| project | str \| None | Which project context |
| created_at | datetime | Creation timestamp |
| session_id | str \| None | Session grouping identifier |
| verification | VerificationStatus | Unverified, Testing, or Verified |
| expires_at | datetime \| None | Optional expiration timestamp |
| updated_at | datetime | Last modification |
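
To make the field table concrete, here is a minimal dataclass sketch with the same field names and types. It is not the class from src/distillery/models.py: the name EntrySketch, the enum value casing, and every default (version 1, active status, unverified, UTC timestamps) are assumptions, and per-type metadata validation is omitted.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Optional


class EntryType(str, Enum):
    SESSION = "session"
    BOOKMARK = "bookmark"
    MINUTES = "minutes"
    MEETING = "meeting"
    REFERENCE = "reference"
    IDEA = "idea"
    INBOX = "inbox"
    PERSON = "person"
    PROJECT = "project"
    DIGEST = "digest"
    GITHUB = "github"
    FEED = "feed"


class EntrySource(str, Enum):
    CLAUDE_CODE = "claude_code"
    MANUAL = "manual"
    IMPORT = "import"
    INFERENCE = "inference"
    DOCUMENTATION = "documentation"
    EXTERNAL = "external"


class EntryStatus(str, Enum):
    ACTIVE = "active"
    PENDING_REVIEW = "pending_review"
    ARCHIVED = "archived"


class VerificationStatus(str, Enum):
    UNVERIFIED = "unverified"  # value casing is assumed
    TESTING = "testing"
    VERIFIED = "verified"


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class EntrySketch:
    content: str
    entry_type: EntryType
    source: EntrySource
    author: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: EntryStatus = EntryStatus.ACTIVE  # assumed default
    tags: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    version: int = 1  # assumed starting value
    project: Optional[str] = None
    session_id: Optional[str] = None
    verification: VerificationStatus = VerificationStatus.UNVERIFIED
    expires_at: Optional[datetime] = None
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)


if __name__ == "__main__":
    entry = EntrySketch(
        content="DuckDB VSS supports HNSW indexes.",
        entry_type=EntryType.REFERENCE,
        source=EntrySource.MANUAL,
        author="alice",
        tags=["domain/storage"],
    )
    print(entry.id, entry.entry_type.value, entry.tags)
```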

Project Structure

distillery/
├── skills/                  # Claude Code skill definitions (loaded via plugin)
│   ├── distill/SKILL.md
│   ├── recall/SKILL.md
│   ├── pour/SKILL.md
│   ├── bookmark/SKILL.md
│   ├── minutes/SKILL.md
│   ├── classify/SKILL.md
│   ├── watch/SKILL.md
│   ├── radar/SKILL.md
│   ├── tune/SKILL.md
│   ├── setup/SKILL.md
│   ├── digest/SKILL.md
│   ├── gh-sync/SKILL.md
│   ├── investigate/SKILL.md
│   ├── briefing/SKILL.md
│   └── CONVENTIONS.md
├── src/distillery/
│   ├── models.py            # Entry, SearchResult, enums
│   ├── config.py            # YAML config loading
│   ├── security.py          # Input sanitization and content validation
│   ├── store/
│   │   ├── protocol.py      # DistilleryStore protocol
│   │   └── duckdb.py        # DuckDB + VSS backend
│   ├── embedding/
│   │   ├── protocol.py      # EmbeddingProvider protocol
│   │   ├── jina.py          # Jina v3 adapter
│   │   └── openai.py        # OpenAI adapter
│   ├── classification/
│   │   ├── models.py        # ClassificationResult, DeduplicationResult
│   │   ├── engine.py        # ClassificationEngine
│   │   └── dedup.py         # DeduplicationChecker
│   ├── mcp/
│   │   ├── server.py        # MCP server (FastMCP 2.x/3.x)
│   │   ├── webhooks.py      # REST webhook endpoints (/api/poll, /api/rescore, /api/maintenance)
│   │   ├── auth.py          # GitHub OAuth via FastMCP GitHubProvider
│   │   ├── middleware.py    # Request logging, rate limiting, security headers
│   │   ├── budget.py        # Embedding API budget tracking
│   │   └── __main__.py      # CLI: --transport stdio|http (composes MCP + webhooks)
│   └── feeds/
│       ├── github.py        # GitHub event adapter
│       ├── rss.py           # RSS/Atom feed adapter
│       ├── scorer.py        # Embedding-based relevance scorer
│       ├── poller.py        # Background feed poller
│       └── interests.py     # Interest extractor for source suggestions
├── tests/                   # 1600+ tests (unit + integration)
├── deploy/
│   ├── fly/                 # Fly.io deployment (persistent DuckDB)
│   └── prefect/             # Prefect Horizon deployment (MotherDuck)
└── docs/                    # This documentation site

Feed Architecture

The ambient intelligence system monitors external sources and scores relevance:

  1. Source registry — managed via /watch, stored in DuckDB
  2. Feed adapters — GitHub REST API events, RSS/Atom feeds
  3. Auto-tagging — source tags (source/github/owner/repo, source/reddit/sub, source/domain) and topic tags matched from KB vocabulary via keyword map. Tags applied at ingestion; backfill via distillery retag CLI.
  4. Relevance scoring — embedding-based cosine similarity against user interest profile, with interest boost (up to +0.15) and per-source trust weighting
  5. Interest extraction — mines existing entries for tags, domains, repos, expertise
  6. Digest generation: /radar uses interest-driven semantic search to surface the most relevant feed entries (falls back to newest-first listing when interests are unavailable)
  7. Automated scheduling — Claude Code routines handle hourly feed polling, daily stale checks, and weekly maintenance. The /api/maintenance webhook endpoint remains available for orchestrated operations. The individual /api/poll, /api/rescore, and /api/classify-batch endpoints (and their /api/hooks/* aliases) are deprecated (see #272).
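
The relevance scoring step (item 4) can be sketched as follows. Only the ingredients come from the description above: cosine similarity against an interest profile, a boost capped at +0.15, and a per-source trust weight. How the real scorer in src/distillery/feeds/scorer.py combines them is not stated, so the formula here (trust-weighted similarity plus a tag-overlap boost, clamped to [0, 1]) and all function and parameter names are assumptions.

```python
import math

MAX_INTEREST_BOOST = 0.15  # cap taken from the text; its use below is assumed


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevance_score(
    entry_vec: list[float],
    profile_vec: list[float],
    entry_tags: set[str],
    interest_tags: set[str],
    trust_weight: float = 1.0,
) -> float:
    base = cosine_similarity(entry_vec, profile_vec)
    # Assumed boost rule: scale the cap by the share of interest tags the entry matches.
    overlap = len(entry_tags & interest_tags)
    boost = MAX_INTEREST_BOOST * overlap / len(interest_tags) if interest_tags else 0.0
    return max(0.0, min(1.0, base * trust_weight + boost))


if __name__ == "__main__":
    score = relevance_score(
        [0.6, 0.8], [0.6, 0.8], {"domain/storage"}, {"domain/storage", "team/distillery"}
    )
    print(round(score, 3))
```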

Search Architecture

distillery_search uses hybrid ranking by default. Two retrieval signals (items 1 and 2) are fused and then weighted by recency:

  1. Vector search — HNSW cosine similarity via DuckDB VSS extension
  2. Keyword search — BM25 via DuckDB FTS extension (migration 7)
  3. Fusion — Reciprocal Rank Fusion (RRF) with configurable k (default 60)
  4. Recency decay — linear weight from 1.0 (today) to recency_min_weight (default 0.5) over recency_window_days (default 90)
  5. Graceful degradation — falls back to vector-only if FTS extension unavailable
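
The fusion and recency steps above can be sketched in a few lines. Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the rankings it appears in, and the recency weight falls linearly from 1.0 to recency_min_weight across the window. Multiplying the fused score by the recency weight, and holding the weight at the minimum for entries older than the window, are assumptions; the function names are illustrative and not taken from the DuckDB backend.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranking (ranks start at 1)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def recency_weight(age_days: float, window_days: float = 90.0, min_weight: float = 0.5) -> float:
    """Linear decay from 1.0 (today) to min_weight at window_days, then flat (assumed)."""
    if age_days <= 0:
        return 1.0
    if age_days >= window_days:
        return min_weight
    return 1.0 - (1.0 - min_weight) * (age_days / window_days)


def hybrid_rank(
    vector_hits: list[str],
    keyword_hits: list[str],
    age_days: dict[str, float],
    k: int = 60,
) -> list[str]:
    fused = rrf_fuse([vector_hits, keyword_hits], k=k)
    # Assumed combination: scale each fused score by its recency weight.
    weighted = {
        doc_id: score * recency_weight(age_days.get(doc_id, 0.0))
        for doc_id, score in fused.items()
    }
    return sorted(weighted, key=lambda doc_id: weighted[doc_id], reverse=True)


if __name__ == "__main__":
    print(hybrid_rank(["a", "b"], ["b", "c"], {"a": 0, "b": 0, "c": 0}))
```

A document found by both retrievers accumulates two reciprocal-rank terms, which is why "b" outranks "a" in the demo even though "a" is the top vector hit.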

distillery_find_similar uses pure cosine similarity (no hybrid) — dedup thresholds depend on calibrated absolute scores.

Scheduling

Distillery uses Claude Code routines for all scheduled tasks:

| Routine | Frequency | Purpose |
| --- | --- | --- |
| distillery-feed-poll | Hourly | Poll all feed sources |
| distillery-stale-check | Daily | Find entries needing refresh or archival |
| distillery-weekly-maintenance | Weekly | Stats, stale entries, feed activity, digest |

Routines run automatically in the background when Claude Code is active. Configure them via /setup.

Webhook Endpoints (Partially Deprecated)

REST endpoints are mounted at /api/* alongside the MCP server in HTTP mode. They are enabled only when DISTILLERY_WEBHOOK_SECRET is set and the runtime flag config.server.webhooks.enabled is true.

| Endpoint | Operation | Cooldown | Status |
| --- | --- | --- | --- |
| POST /api/poll | Poll all feed sources (alias: POST /api/hooks/poll) | 5 min | Deprecated — use /api/maintenance or routines |
| POST /api/rescore | Re-score feed entries (alias: POST /api/hooks/rescore) | 1 hour | Deprecated — use /api/maintenance or routines |
| POST /api/classify-batch | Batch classification (alias: POST /api/hooks/classify-batch) | 5 min | Deprecated — use /api/maintenance or routines |
| POST /api/maintenance | Orchestrated maintenance (poll + rescore + classify + retention) | 6 hours | Active |

Deprecation Notice

The individual /api/poll, /api/rescore, and /api/classify-batch endpoints (and their /api/hooks/* aliases) are deprecated. Scheduling should use Claude Code routines or the orchestrated /api/maintenance endpoint instead. See issue #272 for migration details.

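Auth: Requests must send Authorization: Bearer <DISTILLERY_WEBHOOK_SECRET>. The presented token is compared to the secret with hmac.compare_digest, a constant-time comparison that avoids leaking information through timing.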
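
A bearer-token check along these lines might look like the sketch below. hmac.compare_digest is the standard-library call named above; the is_authorized helper, its header parsing, and the choice to reject an empty secret are assumptions and not the code in webhooks.py.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], secret: str) -> bool:
    """Return True only for 'Bearer <token>' where the token equals the secret."""
    if not secret or not authorization_header:
        # An unset secret never authorizes anything (assumed policy).
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Compare as bytes so non-ASCII input cannot raise TypeError.
    return hmac.compare_digest(token.encode("utf-8"), secret.encode("utf-8"))


if __name__ == "__main__":
    print(is_authorized("Bearer s3cret", "s3cret"))
    print(is_authorized("Bearer wrong", "s3cret"))
```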

Hardening: Per-endpoint asyncio.Lock serializes cooldown checks. BodySizeLimitMiddleware + RateLimitMiddleware (10 req/min, 100 req/hour). Cooldown timestamps persisted to DuckDB via get_metadata/set_metadata.
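
The reason for wrapping the cooldown check in an asyncio.Lock is that the read of the last-run timestamp and the write of the new one must happen as one step; otherwise two concurrent requests could both pass the check. The sketch below shows the pattern with an in-memory timestamp. The CooldownGate class and its injectable clock are hypothetical, and the real implementation persists timestamps to DuckDB via get_metadata/set_metadata, which would call for wall-clock time instead of time.monotonic.

```python
import asyncio
import time
from typing import Callable, Optional


class CooldownGate:
    """Hypothetical per-endpoint gate: at most one accepted call per cooldown period."""

    def __init__(self, cooldown_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._lock = asyncio.Lock()
        self._last_run: Optional[float] = None  # in-memory only in this sketch

    async def try_acquire(self) -> bool:
        # The lock makes check-then-update atomic across concurrent requests.
        async with self._lock:
            now = self._clock()
            if self._last_run is not None and now - self._last_run < self._cooldown:
                return False
            self._last_run = now
            return True


async def demo() -> tuple[bool, bool, bool]:
    fake_now = [0.0]
    gate = CooldownGate(300.0, clock=lambda: fake_now[0])  # 5-minute cooldown
    first = await gate.try_acquire()
    second = await gate.try_acquire()  # still inside the cooldown
    fake_now[0] = 301.0
    third = await gate.try_acquire()  # cooldown has elapsed
    return first, second, third


if __name__ == "__main__":
    print(asyncio.run(demo()))
```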

Audit: Each invocation stores a webhook_audit:{endpoint} metadata record with timestamp, status, and response data. All tool invocations and login events are recorded in the audit_log table, queryable via distillery_list(output_mode="audit") or store.query_audit_log().