# Deployment

## Docker

The service ships with a Dockerfile based on `python:3.12-slim-bookworm`, using `uv` for fast dependency installation.
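The exact contents depend on the repository, but a minimal sketch of a Dockerfile following that pattern might look like the following. The `uv` binary copy and two-stage `uv sync` are the layer-caching pattern documented by Astral; the module path in `CMD` is a placeholder, not the service's actual entrypoint.

```dockerfile
FROM python:3.12-slim-bookworm

# Copy the static uv binaries from the official Astral image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

# Install locked dependencies first so this layer caches well
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-install-project --no-dev

# Then copy the application source and install the project itself
COPY . .
RUN uv sync --frozen --no-dev

EXPOSE 7999
# Placeholder entrypoint: adjust the module path to the real application
CMD ["uv", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7999"]
```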
### Configuration sources

The application supports two configuration sources:

- environment variables
- a YAML config file
Load order (highest precedence first):

1. per-request overrides
2. environment variables
3. YAML config file
4. built-in defaults
Supported config file locations:

- `config.yml`
- `config.yaml`
- `/config/config.yml`
- `/config/config.yaml`
You can also set an explicit config path with:

```bash
export EMAIL_CLASSIFIER_CONFIG=/path/to/config.yml
```
Example `config.yml`:

```yaml
llm:
  provider: anthropic
  base_url: https://api.minimax.io/anthropic
  api_key: your_api_key_here
  model: MiniMax-M2.7
  temperature: 0.1
  timeout_seconds: 60
  max_retries: 3
```
### Building

```bash
docker build -t email-classifier .
```
Running
docker run -d --name email-classifier \
-p 7999:7999 \
-e EMAIL_CLASSIFIER_CONFIG=/config/config.yml \
-e EMAIL_CLASSIFIER_DB_PATH=/data/email_classifier.db \
-v /path/to/config.yml:/config/config.yml:ro \
-v /path/to/local/data:/data \
email-classifier
Mount a persistent volume for `/data` (or wherever `EMAIL_CLASSIFIER_DB_PATH` points) to preserve the dedupe database across container restarts.

Environment variables still override file-based config, so you can keep most settings in YAML and override just one or two values at deploy time.
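For example, to keep everything in the mounted YAML but pin a different model for a single deployment (`LLM_MODEL` is the override variable shown in the Compose example below):

```bash
docker run -d --name email-classifier \
  -p 7999:7999 \
  -e EMAIL_CLASSIFIER_CONFIG=/config/config.yml \
  -e LLM_MODEL=MiniMax-M2.7 \
  -v /path/to/config.yml:/config/config.yml:ro \
  -v /path/to/local/data:/data \
  email-classifier
```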
## Docker Compose example

```yaml
services:
  email-classifier:
    image: your-registry.example.com/your-org/email-classifier:latest
    container_name: email-classifier
    ports:
      - "7999:7999"
    environment:
      EMAIL_CLASSIFIER_CONFIG: /config/config.yml
      EMAIL_CLASSIFIER_DB_PATH: /data/email_classifier.db
      # Optional overrides. Env vars win over YAML values.
      # LLM_MODEL: MiniMax-M2.7
      # LLM_TIMEOUT_SECONDS: "90"
    volumes:
      - ./config.yml:/config/config.yml:ro
      - ./data:/data
    restart: unless-stopped
    # If your LLM backend runs on the Docker host, one option is:
    # extra_hosts:
    #   - "host.docker.internal:host-gateway"
```
### Compose notes

- Mount the YAML config read-only into the container, typically at `/config/config.yml`
- Mount a writable volume for `/data` so dedupe state survives restarts
- Override specific values with environment variables when needed
- If the LLM backend is another container on the same Compose network, use its service name in `base_url` (see the sketch after this list)
- If the LLM backend runs on the host, use `host.docker.internal` or a host-gateway mapping where appropriate
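As an illustration, if the backend were an Ollama container exposed as a Compose service named `ollama` (a hypothetical name; 11434 is Ollama's default port), the config could point at it via Compose's internal DNS:

```yaml
llm:
  # "ollama" resolves to the backend container on the shared Compose network;
  # service name and port are placeholders for whatever backend you run
  base_url: http://ollama:11434
```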
## Building for a Remote Registry

```bash
docker build -t \
  your-registry.example.com/your-org/email-classifier:latest \
  .
docker push your-registry.example.com/your-org/email-classifier:latest
```
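If you also want an immutable per-commit tag (mirroring what the CI workflow below produces), tag the same build with the git SHA as well:

```bash
SHA=$(git rev-parse HEAD)
docker build \
  -t your-registry.example.com/your-org/email-classifier:latest \
  -t "your-registry.example.com/your-org/email-classifier:${SHA}" \
  .
docker push your-registry.example.com/your-org/email-classifier:latest
docker push "your-registry.example.com/your-org/email-classifier:${SHA}"
```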
## GitHub Actions CI/CD

The repository includes a workflow at `.github/workflows/build-publish.yaml` that builds and pushes a Docker image on every push to `main`.
### Required Secrets

Configure these in your GitHub/Gitea Actions secrets:

| Secret | Description |
|---|---|
| `DOCKER_REGISTRY` | Registry hostname (e.g., `ghcr.io` or your custom registry) |
| `DOCKER_USERNAME` | Registry username |
| `DOCKER_PASSWORD` | Registry password or access token |

The workflow tags the image as:

- `:latest`, which always points to the latest commit on `main`
- `:<sha>`, the full git SHA of the triggering commit (useful for rollbacks)
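For reference, a minimal sketch of what such a workflow can look like, using the standard `docker/login-action` and `docker/build-push-action` actions; the actual file in the repository may differ in detail:

```yaml
name: build-publish

on:
  push:
    branches: [main]

jobs:
  build-publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ${{ secrets.DOCKER_REGISTRY }}
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          # ${{ github.sha }} is the full SHA of the triggering commit
          tags: |
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:latest
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:${{ github.sha }}
```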
## Deployment Considerations

- **Network access**: The container needs to reach your LLM backend. If using Ollama or another service on the host, use `host.docker.internal` or an explicit host-gateway mapping.
- **Dedupe persistence**: Mount a volume for the SQLite database to persist dedupe state across deploys.
- **Port**: The container exposes port `7999`. Map it to any host port you prefer.
- **Health check**: The service does not currently expose a dedicated `/health` endpoint. Use `GET /docs` as a liveness probe (see the sketch below).
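For example, a Compose healthcheck probing `GET /docs` might look like the following; it uses the image's own `python` interpreter rather than `curl`, which slim base images do not ship by default:

```yaml
services:
  email-classifier:
    # ... same service definition as above ...
    healthcheck:
      # Exits non-zero (unhealthy) if GET /docs does not return a success response
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:7999/docs')"]
      interval: 30s
      timeout: 5s
      retries: 3
```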
## Production Checklist

- Provide either a YAML config file or the required `LLM_*` environment variables
- Use HTTPS for remote `LLM_BASE_URL` values in production
- Mount a persistent volume for `EMAIL_CLASSIFIER_DB_PATH`
- Set appropriate resource limits (CPU/memory) on the container (see the sketch below)
- Configure `LLM_MAX_RETRIES` and `LLM_TIMEOUT_SECONDS` to suit your LLM backend's reliability
- Keep `LLM_TEMPERATURE` low for consistent classification results
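For the resource-limits item, one Compose-level option is `deploy.resources.limits`; the values below are placeholders, so size them for your traffic and your LLM backend's latency:

```yaml
services:
  email-classifier:
    # ... same service definition as above ...
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
```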