
email-classifier

FastAPI service that classifies email using a configurable LLM backend, returns a structured extraction of each message (summary, people, deadline, next steps, and similar fields), and tracks duplicate classifications with Outlook-aware dedupe.

Environment configuration

LLM defaults:

export LLM_PROVIDER=openai
export LLM_BASE_URL=http://ollama.internal.henryhosted.com:9292/v1
export LLM_API_KEY=none
export LLM_MODEL=qwen2.5-7b-instruct.q4_k_m
export LLM_TEMPERATURE=0.1
export LLM_TIMEOUT_SECONDS=60
export LLM_MAX_RETRIES=3

MiniMax via Anthropic-compatible API:

export LLM_PROVIDER=anthropic
export LLM_BASE_URL=https://api.minimax.io/anthropic
export LLM_API_KEY=your_minimax_key
export LLM_MODEL=MiniMax-M2.7

Optional local dedupe store path:

export EMAIL_CLASSIFIER_DB_PATH=.data/email_classifier.db
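
As an illustration of how these variables might be consumed, here is a minimal sketch of a settings loader. The names `LLMSettings` and `load_settings` are hypothetical and not taken from this repository. The sketch assumes the values listed under "LLM defaults" act as fallbacks when a variable is unset, and that `.data/email_classifier.db` is the default store path; the service itself may behave differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the environment variables documented above."""
    provider: str
    base_url: str
    api_key: str
    model: str
    temperature: float
    timeout_seconds: int
    max_retries: int
    db_path: str


def load_settings(env=None) -> LLMSettings:
    """Read settings from a mapping (os.environ by default).

    The fallback values are assumptions copied from the README examples.
    """
    env = os.environ if env is None else env
    return LLMSettings(
        provider=env.get("LLM_PROVIDER", "openai"),
        base_url=env.get("LLM_BASE_URL", "http://ollama.internal.henryhosted.com:9292/v1"),
        api_key=env.get("LLM_API_KEY", "none"),
        model=env.get("LLM_MODEL", "qwen2.5-7b-instruct.q4_k_m"),
        temperature=float(env.get("LLM_TEMPERATURE", "0.1")),
        timeout_seconds=int(env.get("LLM_TIMEOUT_SECONDS", "60")),
        max_retries=int(env.get("LLM_MAX_RETRIES", "3")),
        db_path=env.get("EMAIL_CLASSIFIER_DB_PATH", ".data/email_classifier.db"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in tests, for example `load_settings({"LLM_PROVIDER": "anthropic"})`.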

Input shape

The request model accepts either:

  • simplified input via email_data
  • or native Outlook-style fields directly

Full Outlook-shaped example:

{
  "id": "AAMk...",
  "internetMessageId": "<...@...>",
  "conversationId": "AAQk...",
  "subject": "MB Printer",
  "bodyPreview": "Good morning, ...",
  "body": {
    "contentType": "html",
    "content": "<html>...(full HTML body)</html>"
  },
  "sender": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "from": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "toRecipients": [
    {
      "emailAddress": {
        "name": "IT Helpdesk Mail",
        "address": "helpdeskmail@grandportage.com"
      }
    }
  ],
  "ccRecipients": [],
  "bccRecipients": [],
  "replyTo": [],
  "receivedDateTime": "2026-02-19T15:27:35Z",
  "sentDateTime": "2026-02-19T15:27:32Z",
  "hasAttachments": false,
  "importance": "normal",
  "isRead": false,
  "flag": { "flagStatus": "notFlagged" },
  "provider": "anthropic",
  "base_url": "https://api.minimax.io/anthropic",
  "model": "MiniMax-M2.7"
}

Simplified request example:

{
  "email_data": {
    "subject": "MB Printer",
    "body": "<html>...</html>"
  },
  "id": "AAMk...",
  "conversationId": "AAQk..."
}
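
To make the difference between the two shapes concrete, the following sketch flattens either one into a common dictionary. `normalize_email` and its output keys are hypothetical, chosen for illustration only; the actual request model in this service may normalize fields differently.

```python
from typing import Any


def normalize_email(payload: dict[str, Any]) -> dict[str, Any]:
    """Flatten either request shape into one dictionary (illustrative only)."""
    simple = payload.get("email_data") or {}
    body = simple.get("body", payload.get("body", ""))
    # Outlook sends body as {"contentType": ..., "content": ...};
    # the simplified shape uses a plain string.
    if isinstance(body, dict):
        body = body.get("content", "")
    # Prefer "from", fall back to "sender"; both nest an emailAddress object.
    sender = (payload.get("from") or payload.get("sender") or {}).get("emailAddress", {})
    return {
        "subject": simple.get("subject", payload.get("subject", "")),
        "body": body,
        "preview": payload.get("bodyPreview", ""),
        "sender": sender.get("address"),
        "id": payload.get("id"),
        "conversation_id": payload.get("conversationId"),
    }
```

With this sketch, both example payloads above yield the same `subject`, `id`, and `conversation_id`; only the Outlook-shaped one carries a sender and a preview.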

Response example

{
  "needs_action": true,
  "category": "question",
  "priority": "high",
  "task_description": "Investigate MB Printer issue and reply",
  "reasoning": "The email appears to describe an issue requiring action.",
  "confidence": 0.91,
  "details": {
    "summary": "Printer issue reported in the MB area.",
    "suggested_title": "Handle MB Printer issue",
    "suggested_notes": "Review the printer problem, identify urgency, and reply with next steps.",
    "deadline": null,
    "people": [],
    "organizations": [],
    "attachments_referenced": [],
    "next_steps": ["Review issue", "Respond to sender"],
    "key_points": ["Printer issue reported"],
    "source_signals": ["request"],
    "dedupe_key": "..."
  },
  "dedupe": {
    "status": "new",
    "seen_count": 1,
    "matched_on": "none",
    "message_id": "AAMk...",
    "conversation_id": "AAQk...",
    "fingerprint": "..."
  }
}
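
A downstream workflow might combine `needs_action`, `confidence`, and `dedupe.status` to decide what to do with a response. The sketch below is one possible policy, not something this API prescribes: `plan_action` and the 0.7 confidence threshold are assumptions made for illustration.

```python
def plan_action(response: dict, min_confidence: float = 0.7) -> str:
    """Return 'create', 'update', or 'skip' for a classifier response.

    The policy and the threshold are illustrative assumptions.
    """
    dedupe_status = response.get("dedupe", {}).get("status", "new")
    # Ignore emails that need no action or that the model is unsure about.
    if not response.get("needs_action") or response.get("confidence", 0.0) < min_confidence:
        return "skip"
    # An unchanged repeat of an already-seen email should not create new work.
    if dedupe_status == "duplicate":
        return "skip"
    # A matched email whose extraction changed should refresh the existing item.
    if dedupe_status == "updated":
        return "update"
    return "create"
```

Applied to the response example above (`needs_action` true, `confidence` 0.91, status `new`), this policy returns `create`.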

Dedupe precedence

  1. id for exact Outlook message match
  2. conversationId for thread grouping
  3. normalized subject + preview/body fingerprint fallback
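
The precedence above can be sketched as a small key-selection function. Everything here is an assumption for illustration: the function names, the reply/forward prefix stripping, the use of SHA-256, and the `matched_on` labels are not taken from this repository, and the service's real fingerprint may be computed differently.

```python
import hashlib
import re


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes, collapse whitespace, lowercase."""
    stripped = re.sub(r"^\s*((re|fw|fwd)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", stripped).strip().lower()


def fingerprint(subject: str, text: str) -> str:
    """Hash the normalized subject together with whitespace-normalized text."""
    normalized_text = re.sub(r"\s+", " ", text).strip().lower()
    material = f"{normalize_subject(subject)}\n{normalized_text}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def dedupe_target(email: dict) -> tuple[str, str]:
    """Return (matched_on, key) following the documented precedence."""
    if email.get("id"):
        return "id", email["id"]
    if email.get("conversationId"):
        return "conversationId", email["conversationId"]
    body = email.get("body")
    # The Outlook shape nests the body text under "content".
    if isinstance(body, dict):
        body = body.get("content")
    text = email.get("bodyPreview") or body or ""
    return "fingerprint", fingerprint(email.get("subject", ""), text)
```

Because the fallback normalizes case, whitespace, and reply prefixes, "RE: MB Printer" and "mb  printer" with the same preview text map to the same fingerprint in this sketch, which is the kind of approximate matching a heuristic path implies.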

Statuses:

  • new: no prior similar email seen
  • duplicate: same dedupe target and same extracted result as before
  • updated: matched prior email, but extracted result changed

The fingerprint fallback (step 3 of the precedence list) is intentionally heuristic.
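
The three statuses can be illustrated with a small SQLite-backed store. This is a sketch under stated assumptions, not the service's actual schema: the `seen` table, its columns, and the `open_store` and `record` helpers are hypothetical, and "same extracted result" is approximated here by comparing canonical JSON.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical dedupe store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "dedupe_key TEXT PRIMARY KEY, "
        "result_json TEXT NOT NULL, "
        "seen_count INTEGER NOT NULL)"
    )
    return conn


def record(conn: sqlite3.Connection, dedupe_key: str, result: dict) -> dict:
    """Classify this sighting as new, duplicate, or updated, and persist it."""
    # sort_keys gives a stable serialization so equal results compare equal.
    result_json = json.dumps(result, sort_keys=True)
    row = conn.execute(
        "SELECT result_json, seen_count FROM seen WHERE dedupe_key = ?",
        (dedupe_key,),
    ).fetchone()
    if row is None:
        conn.execute("INSERT INTO seen VALUES (?, ?, 1)", (dedupe_key, result_json))
        conn.commit()
        return {"status": "new", "seen_count": 1}
    previous_json, seen_count = row
    status = "duplicate" if previous_json == result_json else "updated"
    conn.execute(
        "UPDATE seen SET result_json = ?, seen_count = ? WHERE dedupe_key = ?",
        (result_json, seen_count + 1, dedupe_key),
    )
    conn.commit()
    return {"status": status, "seen_count": seen_count + 1}
```

Recording the same key and result twice yields `new` and then `duplicate`; recording the same key with a changed result yields `updated`, with `seen_count` incremented each time.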

Notes

  • No Todoist integration lives in this API.
  • Dedupe is local and intended to help downstream workflows avoid obvious duplicates.
  • SQLite is used for lightweight local dedupe tracking.