Fly.io Deployment
Deploy the Distillery MCP server to Fly.io with persistent DuckDB storage on a volume, GitHub OAuth, and scale-to-zero billing.
Demo Server
The hosted instance at distillery-mcp.fly.dev is a demo server for evaluation and testing only. Do not store sensitive, proprietary, or confidential data. There are no uptime guarantees, data may be reset without notice, and storage is not encrypted at rest. For production use, deploy your own instance using the instructions below.
Prerequisites
- Fly CLI installed
- A Fly.io account: `fly auth login`
- A GitHub OAuth App registered
Configuration Files
| File | Purpose |
|---|---|
| `deploy/fly/Dockerfile` | Python 3.13-slim image with `distillery-mcp` entrypoint |
| `deploy/fly/fly.toml` | Fly Machine config (scale-to-zero, volume mount, health check) |
| `deploy/fly/distillery-fly.yaml` | Distillery config (DuckDB on volume, Jina embeddings, GitHub OAuth) |
Quick Start
All commands run from the repository root.
1. Create the app
Update the `app` value in `deploy/fly/fly.toml` to match the name of the app you created.
2. Create a persistent volume
Creates a 1 GB NVMe volume for DuckDB. Data persists across deploys and restarts.
3. Set secrets
```bash
fly secrets set \
  JINA_API_KEY=<your-jina-api-key> \
  GITHUB_CLIENT_ID=<your-github-client-id> \
  GITHUB_CLIENT_SECRET=<your-github-client-secret> \
  DISTILLERY_BASE_URL=https://<app-name>.fly.dev \
  --app <app-name>
```
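Fly exposes secrets to the running Machine as environment variables, so a missing secret usually surfaces only at startup. The following is a minimal sketch of a pre-flight check, not code from Distillery itself; `REQUIRED_SECRETS` and `missing_secrets` are hypothetical names chosen for illustration, and the list simply mirrors the four secrets set above.

```python
import os
from typing import Mapping

# Hypothetical list mirroring the secrets set in this step.
REQUIRED_SECRETS = (
    "JINA_API_KEY",
    "GITHUB_CLIENT_ID",
    "GITHUB_CLIENT_SECRET",
    "DISTILLERY_BASE_URL",
)


def missing_secrets(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required secrets that are unset or empty."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit(f"Missing secrets: {', '.join(missing)}")
    print("All required secrets are set.")
```

Passing the environment in as a mapping keeps the check easy to exercise locally without touching real credentials.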
4. Deploy
5. Verify
```bash
fly status --app <app-name>
fly logs --app <app-name>
curl -X POST https://<app-name>.fly.dev/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
```
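The same request can be scripted for repeated checks. Below is a minimal standard-library sketch that builds the JSON-RPC `tools/list` POST from the curl call; `build_tools_list_request` is a hypothetical helper name. The extra `Accept` header is an assumption based on the MCP Streamable HTTP transport, which expects clients to accept both JSON and event-stream responses; the curl call above does not send it.

```python
import json
import urllib.request


def build_tools_list_request(app_name: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) the JSON-RPC tools/list POST for a Fly app."""
    payload = {"jsonrpc": "2.0", "method": "tools/list", "id": request_id}
    return urllib.request.Request(
        url=f"https://{app_name}.fly.dev/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Streamable HTTP servers expect both media types here.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=10)`. With GitHub OAuth enabled, an unauthenticated call may well be rejected with an HTTP error; even so, an HTTP-level response indicates that the Machine woke up and the server is listening.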
Connecting from Claude Code
```json
{
  "mcpServers": {
    "distillery": {
      "url": "https://<app-name>.fly.dev/mcp",
      "transport": "http"
    }
  }
}
```
Claude Code triggers the GitHub OAuth flow on first connection.
Architecture
| Aspect | Details |
|---|---|
| Transport | Streamable HTTP (FastMCP) on port 8000 |
| Storage | Local DuckDB on Fly Volume (/data/distillery.db) |
| Auth | GitHub OAuth via FastMCP GitHubProvider (identity gate only) |
| Scaling | Single machine, scale-to-zero when idle |
| Concurrency | hard_limit=10, soft_limit=5 |
| Memory | 512 MB minimum |
| Cost | ~$3-5/month (512 MB shared CPU + 1 GB volume) |
Authentication Model
GitHub OAuth is used purely as an identity gate: it verifies who the caller is, not what they can access on GitHub. The server never gains access to user repositories or organizations.
The flow (handled by FastMCP's GitHubProvider):
- OAuth requests only the `user` scope (read-only public profile)
- `GitHubTokenVerifier` calls `https://api.github.com/user` to verify tokens
- Identity claims (`login`, `name`, `email`) are available to tool handlers
- The raw GitHub token is never exposed to application code
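The verification step can be pictured roughly as follows. This is an illustrative sketch only, not FastMCP's actual `GitHubTokenVerifier` code: `fetch_github_user` and `identity_claims` are hypothetical names, and the fetcher is injectable so the claim-filtering logic can be exercised without a network call.

```python
import json
import urllib.request
from typing import Callable

GITHUB_USER_URL = "https://api.github.com/user"


def fetch_github_user(token: str) -> dict:
    """Call GitHub's /user endpoint with a bearer token and return the JSON body.

    urlopen raises urllib.error.HTTPError for an invalid or expired token.
    """
    request = urllib.request.Request(
        GITHUB_USER_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def identity_claims(token: str, fetch: Callable[[str], dict] = fetch_github_user) -> dict:
    """Return only the identity claims; the token itself is never included."""
    profile = fetch(token)
    return {key: profile.get(key) for key in ("login", "name", "email")}
```

Returning a filtered dictionary rather than the full profile or the token reflects the identity-gate idea: downstream handlers see who the caller is and nothing more.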
Rate Limiting
| Guard | Default | Purpose |
|---|---|---|
| `embedding_budget_daily` | 500 | Max Jina API calls/day (0 = unlimited) |
| `max_db_size_mb` | 900 | Reject writes above this DB size |
| `warn_db_size_pct` | 80 | Warn in `distillery_status` at this % |
Budget counters are stored in DuckDB's _meta table and survive scale-to-zero restarts.
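How the three guards might interact is sketched below. This is not Distillery's implementation: the `Guards` class and the function names are hypothetical, and the sketch assumes that `warn_db_size_pct` is a percentage of `max_db_size_mb`, which the table does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guards:
    embedding_budget_daily: int = 500  # 0 means unlimited
    max_db_size_mb: int = 900
    warn_db_size_pct: int = 80


def embedding_allowed(calls_today: int, guards: Guards) -> bool:
    """Allow an embedding call while the daily budget is not exhausted."""
    if guards.embedding_budget_daily == 0:
        return True
    return calls_today < guards.embedding_budget_daily


def write_allowed(db_size_mb: float, guards: Guards) -> bool:
    """Reject writes once the database is above the size limit."""
    return db_size_mb <= guards.max_db_size_mb


def should_warn(db_size_mb: float, guards: Guards) -> bool:
    """Warn once the database reaches the configured share of the limit.

    Assumption: the percentage is taken relative to max_db_size_mb.
    """
    return db_size_mb >= guards.max_db_size_mb * guards.warn_db_size_pct / 100
```

Under these assumptions and the default values, warnings would begin at 720 MB, leaving headroom below the 900 MB write limit on the 1 GB volume.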
Backup
Fly takes automatic daily volume snapshots (5-day retention). For additional safety: