Covers: overview, setup, API reference, configuration, testing, deployment, and known quirks.
email-classifier
FastAPI service that classifies emails using a configurable LLM backend. It accepts Outlook-shaped email JSON payloads, extracts structured classification data, and tracks duplicate classifications using a local SQLite dedupe store.
Purpose
This service is designed to help workflow systems (e.g., Todoist ticket creation) automatically process incoming emails by:
- Determining whether an email requires action
- Extracting priority, category, suggested task title/notes, people, organizations, and deadlines
- Deduplicating repeated emails based on Outlook message ID, conversation ID, or content fingerprinting
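The deduplication step can be sketched roughly as follows. This is a minimal illustration, not the service's actual `sync.py` or `dedupe_store.py`: the function `dedupe_key`, the class `DedupeStore`, the `seen` table, the key prefixes, and the precedence order (message ID, then conversation ID, then fingerprint) are all assumptions made for the example. The `id` and `conversationId` field names follow the Microsoft Graph message resource.

```python
import hashlib
import sqlite3


def dedupe_key(email: dict) -> str:
    """Pick an identifier for an email (hypothetical precedence order)."""
    if email.get("id"):
        return f"msg:{email['id']}"
    if email.get("conversationId"):
        return f"conv:{email['conversationId']}"
    # Fall back to a content fingerprint: hash of normalized subject + body.
    body = email.get("body", "")
    if isinstance(body, dict):  # Graph-style body: {"contentType", "content"}
        body = body.get("content", "")
    subject = " ".join(str(email.get("subject", "")).lower().split())
    text = " ".join(str(body).lower().split())
    digest = hashlib.sha256(f"{subject}\n{text}".encode("utf-8")).hexdigest()
    return f"fp:{digest}"


class DedupeStore:
    """Tiny SQLite-backed set of seen keys (illustrative schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)"
        )

    def mark_seen(self, key: str) -> bool:
        """Record the key; return True if it was new, False if a duplicate."""
        cur = self.conn.execute(
            "INSERT OR IGNORE INTO seen (key) VALUES (?)", (key,)
        )
        self.conn.commit()
        return cur.rowcount == 1
```

Calling `mark_seen(dedupe_key(email))` before classification would let a caller skip emails that return `False`. Collapsing whitespace and lowercasing before hashing is one possible normalization; the real fingerprint may differ.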
Key Features
- Configurable LLM providers: OpenAI-compatible (Ollama, LM Studio, OpenAI) or Anthropic-compatible (MiniMax, Anthropic API)
- Outlook-shaped input: Accepts native Microsoft Graph API email payloads with no transformation required
- Simplified input: Also accepts a minimal `email_data` shape with just `subject` and `body`
- Deduplication: Local SQLite store tracks seen emails by message ID, conversation ID, or content fingerprint
- Structured extraction: Returns classification, priority, suggested task title/notes, people, organizations, deadlines, and more
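To illustrate the two accepted input shapes, here is a rough sketch of how they might be reduced to one internal form. The function `normalize_email` and its output keys are hypothetical, and nesting the simplified shape under an `email_data` key is an assumption about the request layout. The Outlook-shaped fields (`id`, `conversationId`, `subject`, `body.content`) follow the Microsoft Graph message resource.

```python
def normalize_email(payload: dict) -> dict:
    """Reduce either input shape to a flat dict (hypothetical helper)."""
    if "email_data" in payload:
        # Simplified shape: only subject and body, no Outlook identifiers.
        data = payload["email_data"]
        return {
            "subject": data.get("subject", ""),
            "body": data.get("body", ""),
            "message_id": None,
            "conversation_id": None,
        }
    # Outlook/Graph shape: body is an object with contentType and content.
    body = payload.get("body") or {}
    return {
        "subject": payload.get("subject", ""),
        "body": body.get("content", "") if isinstance(body, dict) else str(body),
        "message_id": payload.get("id"),
        "conversation_id": payload.get("conversationId"),
    }


# Example payloads with made-up values.
graph_message = {
    "id": "AAMkAGI2",
    "conversationId": "AAQkAGI2",
    "subject": "Budget review",
    "body": {"contentType": "html", "content": "<p>Please review</p>"},
}
simple_message = {
    "email_data": {"subject": "Budget review", "body": "Please review"}
}
```

With the simplified shape no message or conversation ID is available, so deduplication for such emails would presumably have to rely on the content fingerprint.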
Project Structure
email-classifier/
├── app/
│ ├── main.py # FastAPI app entry point
│ ├── config.py # Pydantic settings from environment variables
│ ├── classifier.py # Core classification orchestration
│ ├── llm_adapters.py # OpenAI- and Anthropic-compatible adapter layer
│ ├── models.py # Pydantic request/response models
│ ├── prompts.py # System prompt sent to the LLM
│ ├── sync.py # Deduplication logic and content fingerprinting
│ ├── dedupe_store.py # SQLite persistence for dedupe tracking
│ ├── routers/
│ │ └── classify_email.py # /classify POST endpoint
│ └── helpers/
│ ├── clean_email_html.py
│ ├── extract_latest_message.py
│ └── remove_disclaimer.py
├── docs/ # MkDocs documentation (this site)
├── Dockerfile
├── pyproject.toml
└── uv.lock
Output Classification Schema
Emails are classified into one of these categories:
| Category | Description |
|---|---|
| `action_required` | Direct request requiring user action |
| `question` | Question needing a response |
| `fyi` | Informational, no reply needed |
| `newsletter` | Newsletter or publication |
| `promotional` | Marketing or sales outreach |
| `automated` | Automated system notification |
| `alert` | I.T. or security alert |
| `uncategorized` | Fallback when classification fails |
Priority is one of: high, medium, low.
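The category and priority values could be modeled along these lines. This is a sketch using only the standard library; the actual definitions in `models.py` are not shown here and may use Pydantic types instead. The names `Category`, `Priority`, and `parse_category` are hypothetical, and mapping unrecognized model output to `uncategorized` is an assumption based on that category's description.

```python
from enum import Enum


class Category(str, Enum):
    ACTION_REQUIRED = "action_required"
    QUESTION = "question"
    FYI = "fyi"
    NEWSLETTER = "newsletter"
    PROMOTIONAL = "promotional"
    AUTOMATED = "automated"
    ALERT = "alert"
    UNCATEGORIZED = "uncategorized"


class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def parse_category(raw: object) -> Category:
    """Map raw LLM output to a Category, falling back to UNCATEGORIZED."""
    try:
        return Category(str(raw).strip().lower())
    except ValueError:
        # Unknown or malformed label: treat as a failed classification.
        return Category.UNCATEGORIZED
```

Subclassing `str` keeps the members JSON-friendly, since each member compares equal to its plain string value.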