# Deployment

## Docker

The service ships with a Dockerfile based on `python:3.12-slim-bookworm` that uses `uv` for fast dependency installation.
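
The Dockerfile in the repository is authoritative; as a rough sketch, an image built along the same lines (slim Python base plus `uv`) might look like the following. The entrypoint command and the uvicorn assumption are illustrative, not taken from the repo.

```dockerfile
# Illustrative sketch only; the project's real Dockerfile may differ in layout and entrypoint.
FROM python:3.12-slim-bookworm

# Install uv by copying the static binaries from the official image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

# Install dependencies first so this layer caches independently of source changes
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Copy the application code
COPY . .

EXPOSE 7999
# Hypothetical entrypoint; the actual module/command comes from the project
CMD ["uv", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7999"]
```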

### Configuration sources

The application now supports two configuration sources:

- environment variables
- a YAML config file

Load order (highest precedence first):

  1. per-request overrides
  2. environment variables
  3. YAML config file
  4. built-in defaults

Supported config file locations:

- `config.yml`
- `config.yaml`
- `/config/config.yml`
- `/config/config.yaml`

You can also set an explicit config path with:

```bash
export EMAIL_CLASSIFIER_CONFIG=/path/to/config.yml
```

Example `config.yml`:

```yaml
llm:
  provider: anthropic
  base_url: https://api.minimax.io/anthropic
  api_key: your_api_key_here
  model: MiniMax-M2.7
  temperature: 0.1
  timeout_seconds: 60
  max_retries: 3
```

### Building

```bash
docker build -t email-classifier .
```

### Running

```bash
docker run -d --name email-classifier \
  -p 7999:7999 \
  -e EMAIL_CLASSIFIER_CONFIG=/config/config.yml \
  -e EMAIL_CLASSIFIER_DB_PATH=/data/email_classifier.db \
  -v /path/to/config.yml:/config/config.yml:ro \
  -v /path/to/local/data:/data \
  email-classifier
```

Mount a persistent volume for /data (or wherever EMAIL_CLASSIFIER_DB_PATH points) to preserve the dedupe database across container restarts.

Environment variables still override file-based config, so you can keep most settings in YAML and override just one or two values at deploy time.
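
For example, you might keep everything in `config.yml` and bump only the timeout at run time (illustrative values; `LLM_TIMEOUT_SECONDS` is one of the override variables shown in the Compose example below):

```bash
# Keep most settings in config.yml, override a single value via the environment.
docker run -d --name email-classifier \
  -p 7999:7999 \
  -e EMAIL_CLASSIFIER_CONFIG=/config/config.yml \
  -e LLM_TIMEOUT_SECONDS=90 \
  -v /path/to/config.yml:/config/config.yml:ro \
  -v /path/to/local/data:/data \
  email-classifier
```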

## Docker Compose example

```yaml
services:
  email-classifier:
    image: your-registry.example.com/your-org/email-classifier:latest
    container_name: email-classifier
    ports:
      - "7999:7999"
    environment:
      EMAIL_CLASSIFIER_CONFIG: /config/config.yml
      EMAIL_CLASSIFIER_DB_PATH: /data/email_classifier.db
      # Optional overrides. Env vars win over YAML values.
      # LLM_MODEL: MiniMax-M2.7
      # LLM_TIMEOUT_SECONDS: "90"
    volumes:
      - ./config.yml:/config/config.yml:ro
      - ./data:/data
    restart: unless-stopped
    # If your LLM backend runs on the Docker host, one option is:
    # extra_hosts:
    #   - "host.docker.internal:host-gateway"
```

### Compose notes

- Mount the YAML config read-only into the container, typically at `/config/config.yml`
- Mount a writable volume for `/data` so dedupe state survives restarts
- Override specific values with environment variables when needed
- If the LLM backend is another container on the same Compose network, use its service name in `base_url`
- If the LLM backend runs on the host, use `host.docker.internal` or a host-gateway mapping where appropriate (both options are sketched below)
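
As a sketch of the networking notes above, the `llm.base_url` value in `config.yml` might look like this for the two cases; the `ollama` service name and port are placeholders for whatever backend you actually run, not part of this project:

```yaml
# Backend running as a sibling Compose service: the service name resolves on the Compose network.
llm:
  base_url: http://ollama:11434

# Backend running on the Docker host instead (with the extra_hosts mapping shown above):
# llm:
#   base_url: http://host.docker.internal:11434
```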

## Building for a Remote Registry

```bash
docker build -t \
  your-registry.example.com/your-org/email-classifier:latest \
  .

docker push your-registry.example.com/your-org/email-classifier:latest
```

## GitHub Actions CI/CD

The repository includes a workflow at `.github/workflows/build-publish.yaml` that builds and pushes a Docker image on every push to `main`.
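
The workflow file itself is the source of truth; a build-and-push job of this kind typically looks something like the sketch below. Action versions, step names, and the image path are illustrative, not copied from the repo.

```yaml
# Illustrative sketch only; see .github/workflows/build-publish.yaml for the real workflow.
name: build-publish

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Log in to the registry
        uses: docker/login-action@v3
        with:
          registry: ${{ secrets.DOCKER_REGISTRY }}
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: |
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:latest
            ${{ secrets.DOCKER_REGISTRY }}/your-org/email-classifier:${{ github.sha }}
```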

### Required Secrets

Configure these in your GitHub/Gitea Actions secrets:

| Secret | Description |
| --- | --- |
| `DOCKER_REGISTRY` | Registry hostname (e.g., `ghcr.io` or your custom registry) |
| `DOCKER_USERNAME` | Registry username |
| `DOCKER_PASSWORD` | Registry password or access token |

The workflow tags the image as:

- `:latest` always points to the latest commit on `main`
- `:<sha>` is the full git SHA of the triggering commit, useful for rollbacks (see below)
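
For a rollback, pin the deployment to one of the `:<sha>` tags instead of `:latest`, for example in the Compose file (the SHA is a placeholder for a real commit):

```yaml
services:
  email-classifier:
    image: your-registry.example.com/your-org/email-classifier:<sha>
```

then re-apply with `docker compose up -d`.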

## Deployment Considerations

- **Network access:** The container needs to reach your LLM backend. If using Ollama or another service on the host, use `host.docker.internal` or an explicit host-gateway mapping.
- **Dedupe persistence:** Mount a volume for the SQLite database to persist dedupe state across deploys.
- **Port:** The container exposes port 7999. Map it to any host port you prefer.
- **Health check:** The service does not currently expose a dedicated `/health` endpoint. Use `GET /docs` as a liveness probe, for example with a Compose healthcheck like the one sketched below.
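
A minimal sketch of such a probe as a Compose healthcheck, using the Python interpreter already present in the base image (an assumption; adjust the command if your image differs):

```yaml
services:
  email-classifier:
    # ...rest of the service definition as in the Compose example above
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:7999/docs')"]
      interval: 30s
      timeout: 5s
      retries: 3
```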

## Production Checklist

- Provide either a YAML config file or the required `LLM_*` environment variables
- Use HTTPS for remote `LLM_BASE_URL` values in production
- Mount a persistent volume for `EMAIL_CLASSIFIER_DB_PATH`
- Set appropriate resource limits (CPU/memory) on the container (see the Compose snippet below)
- Configure `LLM_MAX_RETRIES` and `LLM_TIMEOUT_SECONDS` to suit your LLM backend's reliability
- Keep `LLM_TEMPERATURE` low for consistent classification results
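
One way to set resource limits is via Compose; the values below are illustrative, so size them for your workload (with Docker Compose v2, `deploy.resources.limits` is applied outside Swarm as well):

```yaml
services:
  email-classifier:
    # ...rest of the service definition as in the Compose example above
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
```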