Operator Deployment Guide

Deploy the Distillery MCP server in HTTP mode with GitHub OAuth, enabling team members to connect from their Claude Code installations.

Prerequisites

  • Distillery available via one of:
      • uvx distillery-mcp (recommended; runs ephemerally, no persistent install)
      • pip install distillery-mcp (persistent install, adds CLI to PATH)
  • A public domain or IP address for your server
  • A GitHub OAuth App registered (see below)

Step 1: Register a GitHub OAuth App

  1. Visit GitHub Developer Settings
  2. Click "New OAuth App"
  3. Fill in:
      • Application name: Distillery (or your team name)
      • Homepage URL: https://distillery.myteam.com
      • Authorization callback URL: https://distillery.myteam.com/mcp/auth/callback
  4. Click "Register application"
  5. Copy the Client ID and Client Secret

Callback URL

The callback URL must match exactly. The path is always /mcp/auth/callback (managed by FastMCP). HTTPS is required for production.
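
Because the callback URL must match character for character, it can help to derive it from the base URL rather than typing it twice. The following is a minimal sketch; callback_url is a hypothetical helper written for this guide, not part of Distillery or FastMCP.

```python
from urllib.parse import urlsplit

# Fixed callback path, as stated in the note above.
CALLBACK_PATH = "/mcp/auth/callback"


def callback_url(base_url: str, allow_http: bool = False) -> str:
    """Build the OAuth callback URL from a base URL (hypothetical helper)."""
    parts = urlsplit(base_url)
    if parts.scheme != "https" and not allow_http:
        raise ValueError("HTTPS is required for production")
    # Strip a trailing slash so the result never contains '//' before the path.
    return base_url.rstrip("/") + CALLBACK_PATH


print(callback_url("https://distillery.myteam.com"))
```

Paste the printed value into the "Authorization callback URL" field of the OAuth App.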

Step 2: Set Environment Variables

# GitHub OAuth credentials
export GITHUB_CLIENT_ID="<your-client-id>"
export GITHUB_CLIENT_SECRET="<your-client-secret>"

# Base URL for OAuth callback (must be publicly accessible)
export DISTILLERY_BASE_URL="https://distillery.myteam.com"

# Embedding provider
export JINA_API_KEY="<your-jina-api-key>"

# Optional: MotherDuck (if using shared cloud storage)
export MOTHERDUCK_TOKEN="<your-motherduck-token>"

# Optional: GitHub token for private repo feed polling
export GITHUB_TOKEN="ghp_..."

Secrets

Never commit secrets to version control. Use your platform's secret management (Kubernetes secrets, Fly.io secrets, Vault, AWS Secrets Manager, etc.).

Step 3: Configure distillery.yaml

server:
  auth:
    provider: github
    client_id_env: GITHUB_CLIENT_ID
    client_secret_env: GITHUB_CLIENT_SECRET

storage:
  backend: motherduck
  database_path: md:distillery
  motherduck_token_env: MOTHERDUCK_TOKEN

embedding:
  provider: jina
  model: jina-embeddings-v3
  dimensions: 1024
  api_key_env: JINA_API_KEY

team:
  name: My Team

classification:
  confidence_threshold: 0.6
  dedup_skip_threshold: 0.95
  dedup_merge_threshold: 0.80
  dedup_link_threshold: 0.60
  dedup_limit: 5

Configuration Sections

server.auth

| Field | Values | Description |
|---|---|---|
| provider | github, none | Auth provider. none allows unauthenticated access (dev only) |
| client_id_env | env var name | Environment variable holding GitHub Client ID |
| client_secret_env | env var name | Environment variable holding GitHub Client Secret |

storage

| Field | Values | Description |
|---|---|---|
| backend | duckdb, motherduck | Storage backend |
| database_path | path | Local path, s3://..., or md:... for MotherDuck |

Startup Validations

Distillery validates configuration at startup:

  • MotherDuck: database_path must start with md:; if the token env var is not set, a warning is logged and the server continues (it will attempt to connect without authentication)
  • GitHub OAuth: GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET, and DISTILLERY_BASE_URL must be set; missing credentials prevent HTTP transport from starting
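
The same rules can be checked before deploying. The sketch below mirrors the two validations described above. It is an illustration only: preflight and its parameters are hypothetical names, and Distillery's own validation code and messages may differ.

```python
import os
from typing import Mapping

# Variables the text above lists as required for GitHub OAuth.
OAUTH_VARS = ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET", "DISTILLERY_BASE_URL")


def preflight(
    env: Mapping[str, str],
    backend: str,
    database_path: str,
    auth_provider: str = "github",
    token_env: str = "MOTHERDUCK_TOKEN",
) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) following the documented startup rules."""
    errors: list[str] = []
    warnings: list[str] = []
    if backend == "motherduck":
        if not database_path.startswith("md:"):
            errors.append("database_path must start with 'md:'")
        if not env.get(token_env):
            # Documented as a warning: the server still starts.
            warnings.append(f"{token_env} not set")
    if auth_provider == "github":
        for name in OAUTH_VARS:
            if not env.get(name):
                errors.append(f"{name} not set")
    return errors, warnings


if __name__ == "__main__":
    found_errors, found_warnings = preflight(os.environ, "motherduck", "md:distillery")
    print("errors:", found_errors)
    print("warnings:", found_warnings)
```

Running this in the deployment environment before starting the server surfaces missing variables without waiting for a failed startup.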

Step 4: Start the Server

distillery-mcp --transport http --port 8000

Verify

curl -I https://distillery.myteam.com/mcp
# Expected: HTTP 200, 401, or 405
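
For scripted health checks, the curl probe can be approximated in Python with the standard library. This is a sketch: head_status and looks_healthy are hypothetical helper names, and the set of accepted codes is taken directly from the comment above.

```python
import urllib.error
import urllib.request

# Status codes the guide lists as expected from the /mcp endpoint.
EXPECTED_STATUSES = {200, 401, 405}


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return exc.code


def looks_healthy(status: int) -> bool:
    """True when the status is one of the documented expected codes."""
    return status in EXPECTED_STATUSES


if __name__ == "__main__":
    code = head_status("https://distillery.myteam.com/mcp")
    print(code, "ok" if looks_healthy(code) else "unexpected")
```

Connection failures (DNS errors, refused connections, timeouts) are not caught here and surface as exceptions, which is usually the desired behavior in a deployment check.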

Then have a team member follow the Team Member Guide to connect.

Deployment Scenarios

Local Development (no auth)

server:
  auth:
    provider: none

distillery-mcp --transport http --host 127.0.0.1 --port 8000

Warning

This mode is unauthenticated. Bind to 127.0.0.1 so the server accepts connections from localhost only.

Production with MotherDuck

server:
  auth:
    provider: github
storage:
  backend: motherduck
  database_path: md:distillery

Production with S3 Storage

storage:
  backend: duckdb
  database_path: s3://my-bucket/distillery/distillery.db
  s3_region: us-east-1

S3 credentials are resolved from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY or IAM roles.

Platform-Specific Guides

Platform-specific deployment configs live in the distill_ops repo:

  • Fly.io — persistent DuckDB on volume, scale-to-zero (~$3-5/month)
  • Prefect Horizon — managed hosting with MotherDuck

For other platforms (Docker, Kubernetes, AWS/GCP/Azure), use the root Dockerfile in the distillery repo as a starting point:

docker build -t distillery .
docker run -p 8000:8000 -e JINA_API_KEY=... distillery

The Dockerfile is a multi-stage build:

  • Builder stage: based on cgr.dev/chainguard/python:latest-dev. It uses uv to resolve dependencies from uv.lock into a self-contained virtualenv and pre-installs the DuckDB VSS extension.
  • Runtime stage: distroless, based on cgr.dev/chainguard/python:latest. It contains the Python interpreter, the prebuilt .venv, and the copied DuckDB VSS cache under /home/nonroot/.duckdb (so HNSW indexing is available without a network download at startup).

No shell, no apk, no uv binary, no compilers, and no build cache are shipped to production. The runtime image runs as the built-in nonroot user (uid 65532). Because the runtime image has no shell, debug it by starting the Python REPL directly (docker run --rm -it --entrypoint /app/.venv/bin/python <image>) or by rebuilding against the -dev tag for a debug variant.

Pin a specific uv version via the UV_VERSION build arg if you need reproducibility:

docker build --build-arg UV_VERSION=0.11.3 -t distillery .

Database Migrations

Distillery uses a forward-only schema migration system. Migrations run automatically on startup — no manual steps are needed for additive changes (new columns, new tables).

How It Works

  1. On startup, Distillery reads schema_version from the _meta table
  2. Any pending migrations (numbered higher than the current version) run in order
  3. Each migration runs in a transaction — if it fails, the database is unchanged
  4. After all migrations complete, schema_version, duckdb_version, and vss_version are updated in _meta
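
The steps above can be sketched as a small forward-only migration runner. This is an illustration under stated assumptions, not Distillery's implementation: it uses the standard-library sqlite3 module as a stand-in for DuckDB, the MIGRATIONS contents and the key/value layout of _meta are hypothetical, and it records schema_version inside each migration's transaction (a simplification; the text above says the metadata is updated after all migrations complete).

```python
import sqlite3

# Hypothetical migrations keyed by version number.
MIGRATIONS = {
    1: "CREATE TABLE entries (id INTEGER PRIMARY KEY, content TEXT)",
    2: "ALTER TABLE entries ADD COLUMN source TEXT",
}


def current_version(conn: sqlite3.Connection) -> int:
    """Read schema_version from _meta, treating a missing row as version 0."""
    conn.execute("CREATE TABLE IF NOT EXISTS _meta (key TEXT PRIMARY KEY, value TEXT)")
    row = conn.execute("SELECT value FROM _meta WHERE key = 'schema_version'").fetchone()
    return int(row[0]) if row else 0


def migrate(conn: sqlite3.Connection, migrations: dict) -> int:
    """Apply pending migrations in order, each inside its own transaction."""
    version = current_version(conn)
    for number in sorted(n for n in migrations if n > version):
        conn.execute("BEGIN")
        try:
            conn.execute(migrations[number])
            conn.execute(
                "INSERT OR REPLACE INTO _meta (key, value) VALUES ('schema_version', ?)",
                (str(number),),
            )
            conn.execute("COMMIT")
        except Exception:
            # A failed migration leaves the database as it was before BEGIN.
            conn.execute("ROLLBACK")
            raise
        version = number
    return version


# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
print("schema version:", migrate(conn, MIGRATIONS))
```

Running migrate a second time is a no-op, because only migrations numbered above the stored version are selected.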

Check the current schema version:

distillery status
# Shows: Schema version: 6, DuckDB: 1.5.x

Backup Before Upgrading

Before deploying a new version with schema changes, back up the knowledge base:

distillery export --output backup-$(date +%Y%m%d).json

This exports all entries and feed sources to a portable JSON file (embeddings are excluded — they're recomputed on import).

Restoring from Backup

# Merge import — adds missing entries, skips duplicates
distillery import --input backup.json

# Full replace — drops all entries and reimports (re-embeds content)
distillery import --input backup.json --mode replace

Breaking Changes (Rare)

If a release requires incompatible schema changes (e.g., embedding dimension change):

  1. distillery export --output backup.json
  2. Deploy the new version (creates fresh schema)
  3. distillery import --input backup.json --mode replace

Content is re-embedded using the new model during import.

DuckDB Version Compatibility

DuckDB's on-disk format is not guaranteed stable across minor versions. Distillery pins DuckDB to a compatible release range (~=1.5.0) and logs a warning if the stored duckdb_version in _meta differs from the running version.
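
To make the pin concrete: a compatible-release specifier such as ~=1.5.0 accepts any 1.5.x version at or above 1.5.0 and rejects 1.6.0. The sketch below shows that rule plus a drift check like the one described. Both function names are hypothetical, and the parser handles plain numeric versions only (no pre-release or dev suffixes).

```python
from typing import Optional


def parse(version: str) -> tuple:
    """Parse 'X.Y.Z' into a tuple of ints (numeric versions only)."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_compatible_release(running: str, pinned: str = "1.5.0") -> bool:
    """Approximate '~=pinned': same major.minor, and not older than the pin."""
    run, pin = parse(running), parse(pinned)
    return run[:2] == pin[:2] and run >= pin


def version_drift(stored: str, running: str) -> Optional[str]:
    """Return a warning message when the stored and running versions differ."""
    if stored == running:
        return None
    return f"database was last written by DuckDB {stored}, now running {running}"


print(is_compatible_release("1.5.2"))
print(version_drift("1.5.0", "1.5.2"))
```

For production use, the packaging library's version parsing is a more complete option than this hand-rolled parser.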

Audit Log

All authenticated tool invocations and login events are recorded in the audit_log table. As of the API-hardening release there is no public MCP tool that exposes this data — operators query it directly from Python:

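import asyncio

from distillery.config import load_config
from distillery.store.duckdb import DuckDBStore

async def main():
    store = DuckDBStore(load_config().storage.database_path)
    # Up to 50 audit rows recorded on or after 2026-04-01 (UTC)
    return await store.query_audit_log(
        filters={"date_from": "2026-04-01T00:00:00Z"},
        limit=50,
    )

# Top-level await is only valid in an async REPL, so run via asyncio
rows = asyncio.run(main())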

The previous distillery_metrics(scope="audit") MCP tool was removed in the consolidation; if you need MCP-surfaced audit access, file an issue.

query_audit_log accepts a filters dict with optional user (exact user_id match), operation (exact tool name), date_from, and date_to keys. Each row contains user_id, tool, entry_id, action, outcome, and timestamp. From these rows operators can reconstruct the historical four-section view:

  • recent_logins — rows where tool == "login" (successful, failed, org-denied)
  • login_summary — aggregate counts over login rows
  • active_users — distinct user_id values with their max timestamp
  • recent_operations — rows where tool != "login"
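
The four sections can be rebuilt with a few lines of Python. The sketch below assumes each row is a dict with the keys listed above and that timestamps are ISO 8601 strings in a uniform format (so string comparison orders them correctly). The function name, the choice to summarize logins by outcome, and the sample values are all hypothetical.

```python
from collections import Counter


def four_section_view(rows: list, recent: int = 10) -> dict:
    """Rebuild the four-section audit view from raw audit_log rows."""
    logins = [r for r in rows if r["tool"] == "login"]
    operations = [r for r in rows if r["tool"] != "login"]

    def newest_first(items: list) -> list:
        return sorted(items, key=lambda r: r["timestamp"], reverse=True)

    # Latest timestamp seen per user, across logins and operations.
    active_users: dict = {}
    for row in rows:
        user = row["user_id"]
        if user not in active_users or row["timestamp"] > active_users[user]:
            active_users[user] = row["timestamp"]

    return {
        "recent_logins": newest_first(logins)[:recent],
        # Assumption: the summary counts login rows per outcome value.
        "login_summary": dict(Counter(r["outcome"] for r in logins)),
        "active_users": active_users,
        "recent_operations": newest_first(operations)[:recent],
    }


# Hypothetical sample rows for illustration only.
sample_rows = [
    {"user_id": "alice", "tool": "login", "entry_id": None, "action": "login",
     "outcome": "success", "timestamp": "2026-04-01T09:00:00Z"},
    {"user_id": "alice", "tool": "example_tool", "entry_id": "e1", "action": "write",
     "outcome": "success", "timestamp": "2026-04-02T10:00:00Z"},
]
print(four_section_view(sample_rows)["active_users"])
```

Pass the rows returned by query_audit_log to four_section_view; if the store returns tuples or objects rather than dicts, convert them first.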

Scaling

  • Single-worker process, suitable for teams up to ~100 active users
  • Storage delegates to the backend (MotherDuck or S3), which scales independently
  • For larger teams, deploy multiple instances behind a load balancer pointing to the same backend

Security Checklist

  • HTTPS enabled (required for OAuth)
  • GitHub OAuth App credentials stored securely
  • MotherDuck/embedding tokens stored securely
  • Server behind firewall (not directly exposed)
  • Logs do not contain sensitive data
  • Team members use GitHub accounts with 2FA
  • OAuth callback URL matches deployment URL exactly

Common Errors

| Error | Cause | Fix |
|---|---|---|
| GITHUB_CLIENT_ID not set | Missing OAuth credentials | Set the environment variable |
| database_path must start with 'md:' | MotherDuck config mismatch | Use md:distillery as the path |
| MOTHERDUCK_TOKEN not set (warning) | Missing token for MotherDuck | Set MOTHERDUCK_TOKEN env var for authenticated access |
| Server failed to start on port 8000 | Port in use | Use --port 9000 or check lsof -i :8000 |