email-classifier/docs/index.md
Lennie S. 760b56bfd6 Add MkDocs documentation
Covers: overview, setup, API reference, configuration,
testing, deployment, and known quirks.
2026-04-09 20:24:49 +00:00

email-classifier

FastAPI service that classifies emails using a configurable LLM backend. It accepts Outlook-shaped email JSON payloads, extracts structured classification data, and tracks duplicate classifications using a local SQLite dedupe store.

Purpose

This service helps workflow systems (e.g., Todoist task creation) process incoming emails automatically by:

  • Determining whether an email requires action
  • Extracting priority, category, suggested task title/notes, people, organizations, and deadlines
  • Deduplicating repeated emails based on Outlook message ID, conversation ID, or content fingerprinting
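
The deduplication step in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `sync.py` or `dedupe_store.py`: the names `content_fingerprint`, `dedupe_key`, and `DedupeStore`, the table layout, the normalization rule, and the order in which identifiers are tried are all assumptions. The keys `id` and `conversationId` follow Microsoft Graph naming, and the body is treated as a plain string for brevity.

```python
import hashlib
import sqlite3


def content_fingerprint(subject: str, body: str) -> str:
    """Hash the subject and body after lowercasing and collapsing whitespace."""
    normalized = " ".join(f"{subject}\n{body}".lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_key(email: dict) -> tuple:
    """Pick an identifier for the email (the precedence order is an assumption)."""
    if email.get("id"):
        return ("message_id", email["id"])
    if email.get("conversationId"):
        return ("conversation_id", email["conversationId"])
    fingerprint = content_fingerprint(email.get("subject", ""), email.get("body", ""))
    return ("fingerprint", fingerprint)


class DedupeStore:
    """Hypothetical SQLite-backed store of identifiers that were already seen."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS seen ("
            "kind TEXT NOT NULL, value TEXT NOT NULL, PRIMARY KEY (kind, value))"
        )

    def mark_seen(self, email: dict) -> bool:
        """Record the email and return True if it was already in the store."""
        kind, value = dedupe_key(email)
        cursor = self.conn.execute(
            "INSERT OR IGNORE INTO seen (kind, value) VALUES (?, ?)", (kind, value)
        )
        self.conn.commit()
        # rowcount is 0 when the primary key already existed and the insert was ignored
        return cursor.rowcount == 0
```

With this sketch, calling `mark_seen` twice with the same message ID returns `False` the first time and `True` the second. Two emails without IDs count as duplicates when their subject and body differ only in case or whitespace.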

Key Features

  • Configurable LLM providers — OpenAI-compatible (Ollama, LM Studio, OpenAI) or Anthropic-compatible (MiniMax, Anthropic API)
  • Outlook-shaped input — Accepts native Microsoft Graph API email payloads with no transformation required
  • Simplified input — Also accepts a minimal email_data shape with just subject and body
  • Deduplication — Local SQLite store tracks seen emails by message ID, conversation ID, or content fingerprint
  • Structured extraction — Returns classification, priority, suggested task title/notes, people, organizations, deadlines, and more
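
To make the two input shapes listed above concrete, here is an illustrative sketch. The Outlook-shaped fields (`id`, `conversationId`, `subject`, `body.contentType`, `body.content`) follow the Microsoft Graph `message` resource. Treating `email_data` as a top-level wrapper key, and the helper name `extract_subject_and_body`, are assumptions made for this example; the actual request models live in `app/models.py` and may differ.

```python
from typing import Any, Dict, Tuple

# Outlook-shaped payload, using Microsoft Graph `message` field names.
outlook_payload: Dict[str, Any] = {
    "id": "example-message-id",
    "conversationId": "example-conversation-id",
    "subject": "Contract renewal due Friday",
    "body": {"contentType": "text", "content": "Please sign and return by Friday."},
}

# Simplified payload: subject and body only (the wrapper key is an assumption).
simplified_payload: Dict[str, Any] = {
    "email_data": {
        "subject": "Contract renewal due Friday",
        "body": "Please sign and return by Friday.",
    }
}


def extract_subject_and_body(payload: Dict[str, Any]) -> Tuple[str, str]:
    """Return (subject, body text) from either payload shape."""
    if "email_data" in payload:
        data = payload["email_data"]
        return data.get("subject", ""), data.get("body", "")
    body = payload.get("body", "")
    if isinstance(body, dict):
        # Graph wraps the text in an itemBody object: {"contentType", "content"}
        body = body.get("content", "")
    return payload.get("subject", ""), body
```

Both example payloads reduce to the same subject and body text, which is the property a classifier needs before building the LLM prompt.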

Project Structure

email-classifier/
├── app/
│   ├── main.py              # FastAPI app entry point
│   ├── config.py            # Pydantic settings from environment variables
│   ├── classifier.py        # Core classification orchestration
│   ├── llm_adapters.py      # OpenAI- and Anthropic-compatible adapter layer
│   ├── models.py            # Pydantic request/response models
│   ├── prompts.py           # System prompt sent to the LLM
│   ├── sync.py              # Deduplication logic and content fingerprinting
│   ├── dedupe_store.py      # SQLite persistence for dedupe tracking
│   ├── routers/
│   │   └── classify_email.py # /classify POST endpoint
│   └── helpers/
│       ├── clean_email_html.py
│       ├── extract_latest_message.py
│       └── remove_disclaimer.py
├── docs/                    # MkDocs documentation (this site)
├── Dockerfile
├── pyproject.toml
└── uv.lock
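
As a rough illustration of how a client might call the `/classify` endpoint defined in `routers/classify_email.py`, the sketch below builds a JSON POST request with the standard library. The base URL, the absence of a router prefix, and the payload shape are assumptions; `build_classify_request` and `classify` are hypothetical helper names, not part of the project.

```python
import json
import urllib.request


def build_classify_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /classify endpoint (path assumed unprefixed)."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(base_url: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_classify_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For a locally running instance this might be called as `classify("http://localhost:8000", payload)`, where the host and port depend on how the service was started.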

Output Classification Schema

Emails are classified into one of these categories:

Category          Description
action_required   Direct request requiring user action
question          Question needing a response
fyi               Informational, no reply needed
newsletter        Newsletter or publication
promotional       Marketing or sales outreach
automated         Automated system notification
alert             I.T. or security alert
uncategorized     Fallback when classification fails

Priority is one of: high, medium, low.
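
The categories and priorities above map naturally onto enumerations. The following is a sketch, not the definitions in `app/models.py`: the class names and the `parse_category` helper are hypothetical, and the case and whitespace normalization is an assumption. Only the fallback to `uncategorized` comes from the table.

```python
from enum import Enum
from typing import Optional


class Category(str, Enum):
    ACTION_REQUIRED = "action_required"
    QUESTION = "question"
    FYI = "fyi"
    NEWSLETTER = "newsletter"
    PROMOTIONAL = "promotional"
    AUTOMATED = "automated"
    ALERT = "alert"
    UNCATEGORIZED = "uncategorized"


class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def parse_category(raw: Optional[str]) -> Category:
    """Map a raw LLM label to a Category, falling back to UNCATEGORIZED."""
    try:
        return Category((raw or "").strip().lower())
    except ValueError:
        # Unknown or missing label: use the documented fallback category
        return Category.UNCATEGORIZED
```

Because both enums subclass `str`, their members compare equal to the plain strings shown in the table, so `Category.FYI == "fyi"` holds and the values serialize cleanly to JSON.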