## Deployment

### Docker

The service ships with a Dockerfile based on `python:3.12-slim-bookworm`, using `uv` for fast dependency installation.
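For orientation, here is a minimal sketch of what such a Dockerfile typically looks like. It is an illustration, not a copy of the repository's actual Dockerfile; in particular, the `app.main:app` entrypoint and the use of uvicorn are assumptions.

```dockerfile
# Illustrative sketch only; the repository's actual Dockerfile may differ.
FROM python:3.12-slim-bookworm

# Pull the uv binary from its official image for fast dependency installation
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

# Install locked dependencies first so Docker layer caching works well
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

COPY . .

EXPOSE 7999

# Hypothetical entrypoint; the real module path may differ
CMD ["uv", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7999"]
```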
#### Building

```bash
docker build -t email-classifier .
```
#### Running

```bash
docker run -d --name email-classifier \
  -p 7999:7999 \
  -e LLM_PROVIDER=openai \
  -e LLM_BASE_URL=http://your-ollama:11434/v1 \
  -e LLM_API_KEY=none \
  -e LLM_MODEL=qwen2.5-7b-instruct.q4_k_m \
  -e LLM_TEMPERATURE=0.1 \
  -e EMAIL_CLASSIFIER_DB_PATH=/data/email_classifier.db \
  -v /path/to/local/data:/data \
  email-classifier
```
Mount a persistent volume for `/data` (or wherever `EMAIL_CLASSIFIER_DB_PATH` points) to preserve the dedupe database across container restarts.
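If you prefer Compose, the same run configuration might look like the sketch below. This is a hypothetical file, not one shipped with the repository; it uses a named volume `classifier-data` in place of the bind mount above.

```yaml
# docker-compose.yaml -- hypothetical equivalent of the `docker run` command above
services:
  email-classifier:
    image: email-classifier
    ports:
      - "7999:7999"
    environment:
      LLM_PROVIDER: openai
      LLM_BASE_URL: http://your-ollama:11434/v1
      LLM_API_KEY: none
      LLM_MODEL: qwen2.5-7b-instruct.q4_k_m
      LLM_TEMPERATURE: "0.1"
      EMAIL_CLASSIFIER_DB_PATH: /data/email_classifier.db
    volumes:
      # Named volume so the dedupe database survives restarts and re-deploys
      - classifier-data:/data
    restart: unless-stopped

volumes:
  classifier-data:
```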
#### Building for a Remote Registry

```bash
docker build -t \
  your-registry.example.com/your-org/email-classifier:latest \
  .
docker push your-registry.example.com/your-org/email-classifier:latest
```
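If the registry serves hosts with more than one CPU architecture, you can optionally produce a multi-arch image with Buildx. This is a general Docker technique, not something the repository necessarily configures, and the platform list here is an assumption.

```bash
# Optional multi-arch build and push; trim --platform to what your hosts run
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t your-registry.example.com/your-org/email-classifier:latest \
  --push \
  .
```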
### GitHub Actions CI/CD

The repository includes a workflow at `.github/workflows/build-publish.yaml` that builds and pushes a Docker image on every push to `main`.

#### Required Secrets

Configure these in your GitHub/Gitea Actions secrets:
| Secret | Description |
|---|---|
| `DOCKER_REGISTRY` | Registry hostname (e.g., `ghcr.io` or your custom registry) |
| `DOCKER_USERNAME` | Registry username |
| `DOCKER_PASSWORD` | Registry password or access token |
The workflow tags the image as:

- `:latest` — always points to the latest commit on `main`
- `:<sha>` — the full git SHA of the triggering commit (useful for rollbacks)
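For reference, a workflow of this shape commonly looks like the sketch below. It is a representative example built from the standard `actions/checkout`, `docker/login-action`, and `docker/build-push-action` actions, not a verbatim copy of the repository's `build-publish.yaml`; the `your-org/email-classifier` image path is a placeholder.

```yaml
# Representative sketch; the repository's actual build-publish.yaml may differ.
name: build-publish

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: docker/login-action@v3
        with:
          registry: ${{ secrets.DOCKER_REGISTRY }}
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          # Tag with both :latest and the full commit SHA, as described above
          tags: |
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:latest
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:${{ github.sha }}
```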
### Deployment Considerations

- **Network access** — The container needs to reach your LLM backend. If you run Ollama on the host, use `host.docker.internal` as the base URL host: it works out of the box on Docker Desktop for macOS and Windows (replacing the older `docker.for.mac.localhost` alias), while on Linux you must also pass `--add-host=host.docker.internal:host-gateway` to `docker run`.
- **Dedupe persistence** — Mount a volume for the SQLite database to persist dedupe state across deploys.
- **Port** — The container exposes port `7999`. Map it to any host port you prefer.
- **Health check** — The service does not currently expose a dedicated `/health` endpoint. Use `GET /docs` as a liveness probe (see the sketch after this list).
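As one way to wire up that liveness probe, a Compose-style health check can call `GET /docs` with the Python interpreter already present in the image. This is a sketch assuming Compose; adapt it for Kubernetes or whatever orchestrator you use.

```yaml
# Sketch: liveness probe against GET /docs using the image's own Python,
# so it works even though the slim base image ships without curl or wget.
services:
  email-classifier:
    # ... image, ports, environment as above ...
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:7999/docs')"]
      interval: 30s
      timeout: 5s
      retries: 3
```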
### Production Checklist

- Set `LLM_API_KEY` to a real key (not `none`) in production
- Use HTTPS for `LLM_BASE_URL` in production
- Mount a persistent volume for `EMAIL_CLASSIFIER_DB_PATH`
- Set appropriate resource limits (CPU/memory) on the container (see the example after this checklist)
- Configure `LLM_MAX_RETRIES` and `LLM_TIMEOUT_SECONDS` to suit your LLM backend's reliability
- Set `LLM_TEMPERATURE=0.1` (or a similarly low value) for consistent classification results
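Putting several of these items together, a production `docker run` might look like the following. The resource limits and the retry/timeout values are illustrative assumptions to tune for your backend, and the key, URL, and `/srv/email-classifier/data` path are placeholders.

```bash
# Illustrative production invocation; limits and LLM_* values are assumptions
docker run -d --name email-classifier \
  --cpus 1.0 --memory 512m \
  -p 7999:7999 \
  -e LLM_PROVIDER=openai \
  -e LLM_BASE_URL=https://llm.example.com/v1 \
  -e LLM_API_KEY=sk-your-real-key \
  -e LLM_MODEL=qwen2.5-7b-instruct.q4_k_m \
  -e LLM_TEMPERATURE=0.1 \
  -e LLM_MAX_RETRIES=3 \
  -e LLM_TIMEOUT_SECONDS=60 \
  -e EMAIL_CLASSIFIER_DB_PATH=/data/email_classifier.db \
  -v /srv/email-classifier/data:/data \
  email-classifier
```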