Architecture
Distillery is built as a layered system in which skills (SKILL.md files) drive all user interaction, the MCP server mediates all storage access, and backends are swappable through typed Protocol interfaces.
Layers
| Layer | What it does | Key files |
|---|---|---|
| Skills | 14 SKILL.md files — portable, version-controlled slash commands. Not Python code. | skills/*/SKILL.md |
| MCP Server | Tools exposed over stdio (local) or streamable-HTTP (team). Built on FastMCP 2.x/3.x with @server.tool decorators. | src/distillery/mcp/server.py |
| Webhook API | REST endpoints mounted under /api/* for orchestrated operations. The active endpoint is POST /api/maintenance (full poll → rescore → classify-batch pipeline). The individual scheduling endpoints POST /api/poll, POST /api/rescore, and POST /api/classify-batch (also reachable at POST /api/hooks/poll, /api/hooks/rescore, and /api/hooks/classify-batch during the deprecation window) are deprecated in favour of Claude Code routines and the orchestrated maintenance endpoint. Bearer token auth, per-endpoint cooldowns persisted to DuckDB. Mounted alongside MCP in HTTP mode. | src/distillery/mcp/webhooks.py |
| Auth | MCP: GitHub OAuth with org-restricted access. Webhooks: bearer token via DISTILLERY_WEBHOOK_SECRET. Middleware handles logging, rate limiting, security headers, budget tracking. | src/distillery/mcp/auth.py, middleware.py, budget.py |
| Core Protocols | Typed Protocol interfaces (structural subtyping, not ABCs). All storage operations are async. Includes query_audit_log for audit data access. | src/distillery/store/protocol.py, embedding/protocol.py |
| Feeds | GitHub events and RSS/Atom polling. Authenticated via GITHUB_TOKEN for private repos. Auto-tagging (source + topic tags from KB vocabulary). Relevance scoring via embeddings. Interest extraction for source suggestions. | src/distillery/feeds/ |
| Backends | DuckDB + VSS (HNSW) + FTS (BM25). Hybrid search with RRF fusion and recency decay. Jina v3 / OpenAI embeddings. LLM classification with dedup + conflict detection. | src/distillery/store/duckdb.py, embedding/, classification/ |
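The Core Protocols row describes structural subtyping with async operations. The following is a minimal sketch of that idea, not the project's actual interface: `KnowledgeStore`, `InMemoryStore`, `round_trip`, and the `store`/`get` method names are hypothetical stand-ins chosen for illustration.

```python
import asyncio
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class KnowledgeStore(Protocol):
    """Hypothetical stand-in for a storage protocol (method names are assumed)."""

    async def store(self, content: str) -> str: ...

    async def get(self, entry_id: str) -> Optional[str]: ...


class InMemoryStore:
    """Conforms to KnowledgeStore structurally; it does not inherit from it."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    async def store(self, content: str) -> str:
        entry_id = f"entry-{len(self._rows) + 1}"
        self._rows[entry_id] = content
        return entry_id

    async def get(self, entry_id: str) -> Optional[str]:
        return self._rows.get(entry_id)


async def round_trip(backend: KnowledgeStore, content: str) -> Optional[str]:
    # Callers depend only on the protocol, so any conforming backend can be passed in.
    entry_id = await backend.store(content)
    return await backend.get(entry_id)


if __name__ == "__main__":
    print(asyncio.run(round_trip(InMemoryStore(), "hello")))
```

Because conformance is structural, a test double or an alternative backend only needs matching method signatures; no base class has to be imported.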
Key Design Decisions
Skills are SKILL.md files, not Python code. They are portable, version-controlled, and team-shareable. Claude Code loads the markdown and follows the instructions — no compilation or import required.
MCP server is the primary runtime interface. All storage access goes through the protocol, over stdio (local) or HTTP (team). Skills never access the database directly. REST webhook endpoints provide a secondary interface for automated scheduling — they share the same DuckDB store and run in the same uvicorn process.
Storage abstraction via DistilleryStore protocol. Enables future migration to Elasticsearch without rewriting skills or the MCP server.
Configurable embedding providers. Swap between Jina v3, OpenAI, or a zero-vector stub for testing via distillery.yaml.
Semantic deduplication. Prevents knowledge base pollution with configurable thresholds:
| Threshold | Default | Action |
|---|---|---|
| Skip | 0.95 | Near-duplicate — don't store |
| Merge | 0.80 | Similar enough to combine |
| Link | 0.60 | Related — store with cross-reference |
| Below 0.60 | — | Unique — store normally |
Classification with confidence scoring. LLM-based type assignment with a team review queue for low-confidence results (below the configurable confidence_threshold, default: 60%).
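The deduplication thresholds listed earlier in this section can be read as a simple decision ladder. The sketch below assumes each threshold is an inclusive lower bound (the source does not say whether boundaries are inclusive), and the names `DedupThresholds` and `dedup_action` are illustrative rather than taken from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupThresholds:
    """Default values copied from the threshold table; the class itself is hypothetical."""

    skip: float = 0.95
    merge: float = 0.80
    link: float = 0.60


def dedup_action(similarity: float, thresholds: DedupThresholds = DedupThresholds()) -> str:
    """Map a cosine similarity score to a deduplication action.

    Assumption: a score equal to a threshold triggers that threshold's action.
    """
    if similarity >= thresholds.skip:
        return "skip"    # near-duplicate, do not store
    if similarity >= thresholds.merge:
        return "merge"   # similar enough to combine
    if similarity >= thresholds.link:
        return "link"    # related, store with a cross-reference
    return "store"       # unique, store normally


if __name__ == "__main__":
    for score in (0.97, 0.85, 0.60, 0.30):
        print(score, dedup_action(score))
```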
Core Data Model
The Entry dataclass (src/distillery/models.py) is the fundamental unit of knowledge:
| Field | Type | Description |
|---|---|---|
| id | str (UUID4) | Unique identifier |
| content | str | The knowledge content |
| entry_type | EntryType | session, bookmark, minutes, meeting, reference, idea, inbox, person, project, digest, github, feed |
| source | EntrySource | claude_code, manual, import, inference, documentation, external |
| status | EntryStatus | active, pending_review, archived |
| tags | list[str] | Hierarchical tags (e.g. project/distillery/decisions) |
| metadata | dict | Type-specific fields (validated per entry type) |
| version | int | Incremented on updates |
| author | str | Who created the entry |
| project | str \| None | Which project context |
| created_at | datetime | Creation timestamp |
| session_id | str \| None | Session grouping identifier |
| verification | VerificationStatus | Unverified, Testing, or Verified |
| expires_at | datetime \| None | Optional expiration timestamp |
| updated_at | datetime | Last modification |
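As a rough illustration of the field table, the sketch below models a subset of the fields as a dataclass. It is not the real `Entry` class: `EntrySketch`, `apply_update`, the enum member values, and all defaults are assumptions, and `entry_type` and `source` are simplified to plain strings.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, List, Optional


class EntryStatus(str, Enum):
    ACTIVE = "active"
    PENDING_REVIEW = "pending_review"
    ARCHIVED = "archived"


class VerificationStatus(str, Enum):
    # Member values are assumed; the source only lists the three states.
    UNVERIFIED = "unverified"
    TESTING = "testing"
    VERIFIED = "verified"


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class EntrySketch:
    """Illustrative subset of the documented fields; defaults are assumptions."""

    content: str
    entry_type: str  # simplified: the real field is an EntryType enum
    source: str      # simplified: the real field is an EntrySource enum
    author: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: EntryStatus = EntryStatus.ACTIVE
    verification: VerificationStatus = VerificationStatus.UNVERIFIED
    tags: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)
    version: int = 1
    project: Optional[str] = None
    session_id: Optional[str] = None
    expires_at: Optional[datetime] = None
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)

    def apply_update(self, content: str) -> None:
        # Hypothetical helper: bump the version and refresh updated_at on every change.
        self.content = content
        self.version += 1
        self.updated_at = _utc_now()


if __name__ == "__main__":
    entry = EntrySketch("Use RRF for hybrid search", "idea", "manual", "alice",
                        tags=["project/distillery/decisions"])
    entry.apply_update("Use RRF (k=60) for hybrid search")
    print(entry.id, entry.version, entry.status.value)
```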
Project Structure
distillery/
├── skills/ # Claude Code skill definitions (loaded via plugin)
│ ├── distill/SKILL.md
│ ├── recall/SKILL.md
│ ├── pour/SKILL.md
│ ├── bookmark/SKILL.md
│ ├── minutes/SKILL.md
│ ├── classify/SKILL.md
│ ├── watch/SKILL.md
│ ├── radar/SKILL.md
│ ├── tune/SKILL.md
│ ├── setup/SKILL.md
│ ├── digest/SKILL.md
│ ├── gh-sync/SKILL.md
│ ├── investigate/SKILL.md
│ ├── briefing/SKILL.md
│ └── CONVENTIONS.md
├── src/distillery/
│ ├── models.py # Entry, SearchResult, enums
│ ├── config.py # YAML config loading
│ ├── security.py # Input sanitization and content validation
│ ├── store/
│ │ ├── protocol.py # DistilleryStore protocol
│ │ └── duckdb.py # DuckDB + VSS backend
│ ├── embedding/
│ │ ├── protocol.py # EmbeddingProvider protocol
│ │ ├── jina.py # Jina v3 adapter
│ │ └── openai.py # OpenAI adapter
│ ├── classification/
│ │ ├── models.py # ClassificationResult, DeduplicationResult
│ │ ├── engine.py # ClassificationEngine
│ │ └── dedup.py # DeduplicationChecker
│ ├── mcp/
│ │ ├── server.py # MCP server (FastMCP 2.x/3.x)
│ │ ├── webhooks.py # REST webhook endpoints (/api/poll, /api/rescore, /api/maintenance)
│ │ ├── auth.py # GitHub OAuth via FastMCP GitHubProvider
│ │ ├── middleware.py # Request logging, rate limiting, security headers
│ │ ├── budget.py # Embedding API budget tracking
│ │ └── __main__.py # CLI: --transport stdio|http (composes MCP + webhooks)
│ └── feeds/
│   ├── github.py # GitHub event adapter
│   ├── rss.py # RSS/Atom feed adapter
│   ├── scorer.py # Embedding-based relevance scorer
│   ├── poller.py # Background feed poller
│   └── interests.py # Interest extractor for source suggestions
├── tests/ # 1600+ tests (unit + integration)
├── deploy/
│ ├── fly/ # Fly.io deployment (persistent DuckDB)
│ └── prefect/ # Prefect Horizon deployment (MotherDuck)
└── docs/ # This documentation site
Feed Architecture
The ambient intelligence system monitors external sources and scores relevance:
- Source registry: managed via /watch, stored in DuckDB
- Feed adapters: GitHub REST API events, RSS/Atom feeds
- Auto-tagging: source tags (source/github/owner/repo, source/reddit/sub, source/domain) and topic tags matched from KB vocabulary via keyword map. Tags are applied at ingestion; backfill via the distillery retag CLI.
- Relevance scoring: embedding-based cosine similarity against the user interest profile, with an interest boost (up to +0.15) and per-source trust weighting
- Interest extraction: mines existing entries for tags, domains, repos, expertise
- Digest generation: /radar uses interest-driven semantic search to surface the most relevant feed entries (falls back to newest-first listing when interests are unavailable)
- Automated scheduling: Claude Code routines handle hourly feed polling, daily stale checks, and weekly maintenance. The /api/maintenance webhook endpoint remains available for orchestrated operations. Individual /hooks/* scheduling endpoints are deprecated (see #272).
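The relevance scoring step in the list above can be sketched as follows. Only the cosine similarity and the +0.15 cap come from the description. How the boost is derived and how trust is applied are not specified, so the per-match increment of 0.05, the multiplicative trust weight, the clamp to 1.0, and the function names are all assumptions made for illustration.

```python
import math
from typing import Sequence

MAX_INTEREST_BOOST = 0.15  # cap stated in the description above
BOOST_PER_MATCH = 0.05     # assumed increment per matched interest


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(entry_vec: Sequence[float], interest_vec: Sequence[float],
              matched_interests: int = 0, trust: float = 1.0) -> float:
    """Hypothetical combination: (similarity + capped boost) * trust, clamped to 1.0."""
    base = cosine_similarity(entry_vec, interest_vec)
    boost = min(MAX_INTEREST_BOOST, BOOST_PER_MATCH * matched_interests)
    return min(1.0, (base + boost) * trust)


if __name__ == "__main__":
    print(relevance([0.6, 0.8], [0.8, 0.6], matched_interests=2, trust=0.9))
```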
Search Architecture
distillery_search uses hybrid ranking by default, combining two retrieval signals:
- Vector search: HNSW cosine similarity via DuckDB VSS extension
- Keyword search: BM25 via DuckDB FTS extension (migration 7)
- Fusion: Reciprocal Rank Fusion (RRF) with configurable k (default 60)
- Recency decay: linear weight from 1.0 (today) to recency_min_weight (default 0.5) over recency_window_days (default 90)
- Graceful degradation: falls back to vector-only if the FTS extension is unavailable
distillery_find_similar uses pure cosine similarity (no hybrid) — dedup thresholds depend on calibrated absolute scores.
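A minimal sketch of the hybrid ranking described above, using the standard RRF formula (each ranking contributes 1 / (k + rank)) and a linear recency weight. Two details are assumptions rather than documented behaviour: the recency weight is multiplied into the fused score, and entries older than the window stay at the minimum weight. Function names are illustrative.

```python
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranking a document appears in."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def recency_weight(age_days: float, window_days: float = 90.0,
                   min_weight: float = 0.5) -> float:
    """Linear decay from 1.0 at age 0 to min_weight at window_days.

    Assumption: ages beyond the window are clamped to min_weight.
    """
    fraction = min(max(age_days, 0.0), window_days) / window_days
    return 1.0 - (1.0 - min_weight) * fraction


def hybrid_rank(vector_hits: List[str], keyword_hits: List[str],
                ages_days: Dict[str, float], k: int = 60) -> List[str]:
    """Fuse both rankings, then scale by recency (the multiplication is an assumption)."""
    fused = rrf_fuse([vector_hits, keyword_hits], k)
    weighted = {doc: score * recency_weight(ages_days.get(doc, 0.0))
                for doc, score in fused.items()}
    return sorted(weighted, key=lambda doc: weighted[doc], reverse=True)


if __name__ == "__main__":
    print(hybrid_rank(["a", "b"], ["b", "c"], {"a": 0.0, "b": 10.0, "c": 120.0}))
```

RRF works on ranks rather than raw scores, which is why it can combine cosine similarity and BM25 without normalising them, and also why `distillery_find_similar` avoids it when absolute similarity values matter.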
Scheduling
Distillery uses Claude Code routines for all scheduled tasks:
| Routine | Frequency | Purpose |
|---|---|---|
| distillery-feed-poll | Hourly | Poll all feed sources |
| distillery-stale-check | Daily | Find entries needing refresh or archival |
| distillery-weekly-maintenance | Weekly | Stats, stale entries, feed activity, digest |
Routines run automatically in the background when Claude Code is active. Configure them via /setup.
Webhook Endpoints (Partially Deprecated)
REST endpoints are mounted at /api/* alongside the MCP server in HTTP mode. They are enabled only when DISTILLERY_WEBHOOK_SECRET is set and the runtime flag config.server.webhooks.enabled is true.
| Endpoint | Operation | Cooldown | Status |
|---|---|---|---|
| POST /api/poll | Poll all feed sources (alias: POST /api/hooks/poll) | 5 min | Deprecated; use /api/maintenance or routines |
| POST /api/rescore | Re-score feed entries (alias: POST /api/hooks/rescore) | 1 hour | Deprecated; use /api/maintenance or routines |
| POST /api/classify-batch | Batch classification (alias: POST /api/hooks/classify-batch) | 5 min | Deprecated; use /api/maintenance or routines |
| POST /api/maintenance | Orchestrated maintenance (poll + rescore + classify + retention) | 6 hours | Active |
Deprecation Notice
The individual /api/poll, /api/rescore, and /api/classify-batch endpoints (and their /api/hooks/* aliases) are deprecated. Scheduling should use Claude Code routines or the orchestrated /api/maintenance endpoint instead. See issue #272 for migration details.
Auth: Authorization: Bearer <DISTILLERY_WEBHOOK_SECRET>, with the token checked via hmac.compare_digest (a constant-time comparison).
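The bearer check might look roughly like the sketch below. `hmac.compare_digest` is the standard-library call named above; the helper `is_authorized`, its header parsing, and its handling of missing values are assumptions, not the project's code.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], secret: str) -> bool:
    """Return True only for 'Bearer <token>' where the token equals the secret.

    The comparison uses hmac.compare_digest so that timing does not reveal
    how many leading characters of a guessed token were correct.
    """
    if not authorization_header or not secret:
        return False  # reject when the header is missing or no secret is configured
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Compare as bytes: compare_digest only accepts str arguments that are pure ASCII.
    return hmac.compare_digest(token.strip().encode("utf-8"), secret.encode("utf-8"))


if __name__ == "__main__":
    print(is_authorized("Bearer s3cret", "s3cret"))
    print(is_authorized("Bearer wrong", "s3cret"))
```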
Hardening: Per-endpoint asyncio.Lock serializes cooldown checks. BodySizeLimitMiddleware + RateLimitMiddleware (10 req/min, 100 req/hour). Cooldown timestamps persisted to DuckDB via get_metadata/set_metadata.
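To show why the lock matters, here is a minimal in-memory sketch of a cooldown gate. Without the lock, two concurrent requests could both read a stale timestamp and both proceed. The `CooldownGate` class is hypothetical, and it keeps the timestamp in memory with an injectable clock, whereas the description above persists timestamps to DuckDB.

```python
import asyncio
import time
from typing import Callable, Optional


class CooldownGate:
    """Hypothetical per-endpoint gate; state is in memory, not persisted."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._lock = asyncio.Lock()
        self._last_run: Optional[float] = None

    async def try_acquire(self) -> bool:
        # The lock makes "check the timestamp, then record a new one" atomic.
        async with self._lock:
            now = self._clock()
            if self._last_run is not None and now - self._last_run < self._cooldown:
                return False  # still cooling down, caller should reject the request
            self._last_run = now
            return True


async def demo() -> None:
    gate = CooldownGate(cooldown_seconds=300.0)  # 5 min, as for /api/poll
    print(await gate.try_acquire())  # first call is allowed
    print(await gate.try_acquire())  # immediate second call is rejected


if __name__ == "__main__":
    asyncio.run(demo())
```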
Audit: Each invocation stores a webhook_audit:{endpoint} metadata record with timestamp, status, and response data. All tool invocations and login events are recorded in the audit_log table, queryable via distillery_list(output_mode="audit") or store.query_audit_log().