Add MkDocs documentation

Covers: overview, setup, API reference, configuration,
testing, deployment, and known quirks.
Lennie S.
2026-04-09 20:24:49 +00:00
parent 17191fc489
commit 760b56bfd6
8 changed files with 691 additions and 0 deletions

docs/index.md Normal file

@@ -0,0 +1,61 @@
# email-classifier
FastAPI service that classifies emails using a configurable LLM backend. It accepts Outlook-shaped email JSON payloads, extracts structured classification data, and uses a local SQLite dedupe store to detect emails it has already classified.
## Purpose
This service helps workflow systems (for example, one that creates Todoist tasks) process incoming emails automatically by:
- Determining whether an email requires action
- Extracting priority, category, suggested task title/notes, people, organizations, and deadlines
- Deduplicating repeated emails based on Outlook message ID, conversation ID, or content fingerprinting
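
The deduplication step listed above might be sketched as follows. This is a minimal illustration, not the code in `app/sync.py` or `app/dedupe_store.py`: the function names, the `seen` table layout, the precedence of message ID over conversation ID over fingerprint, the flat string `body`, and the use of SHA-256 over normalized text are all assumptions.

```python
import hashlib
import sqlite3


def content_fingerprint(subject: str, body: str) -> str:
    """Hash the lowercased, whitespace-normalized subject and body (assumed scheme)."""
    normalized = " ".join(f"{subject}\n{body}".lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_key(email: dict) -> str:
    """Prefer the message ID, then the conversation ID, then a content fingerprint."""
    if email.get("id"):
        return f"message:{email['id']}"
    if email.get("conversationId"):
        return f"conversation:{email['conversationId']}"
    return "content:" + content_fingerprint(
        email.get("subject", ""), email.get("body", "")
    )


def is_duplicate(conn: sqlite3.Connection, email: dict) -> bool:
    """Record the email's key and report whether it was already stored."""
    conn.execute("CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)")
    cursor = conn.execute(
        "INSERT OR IGNORE INTO seen (key) VALUES (?)", (dedupe_key(email),)
    )
    conn.commit()
    # rowcount is 0 when the key already existed and the insert was ignored.
    return cursor.rowcount == 0
```

Using `INSERT OR IGNORE` against a primary key keeps the check and the write in a single statement, so two requests carrying the same email cannot both be reported as new.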
## Key Features
- **Configurable LLM providers** — OpenAI-compatible (Ollama, LM Studio, OpenAI) or Anthropic-compatible (MiniMax, Anthropic API)
- **Outlook-shaped input** — Accepts native Microsoft Graph API email payloads with no transformation required
- **Simplified input** — Also accepts a minimal `email_data` shape with just `subject` and `body`
- **Deduplication** — Local SQLite store tracks seen emails by message ID, conversation ID, or content fingerprint
- **Structured extraction** — Returns classification, priority, suggested task title/notes, people, organizations, deadlines, and more
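
To make the two input shapes concrete, here is a hedged sketch of what each payload might look like. The Outlook-shaped example uses a small subset of Microsoft Graph message fields (`id`, `conversationId`, `subject`, `from`, `body`). Nesting the simplified shape under an `email_data` key is an assumption, as are the sample values and the hypothetical `subject_and_body` helper; the real request models live in `app/models.py`.

```python
# Outlook-shaped payload: a subset of Microsoft Graph message fields.
outlook_payload = {
    "id": "example-message-id",
    "conversationId": "example-conversation-id",
    "subject": "Contract renewal due Friday",
    "from": {"emailAddress": {"name": "Dana Reyes", "address": "dana@example.com"}},
    "body": {"contentType": "html", "content": "<p>Please sign by Friday.</p>"},
}

# Simplified payload: only `subject` and `body` (nesting is an assumption).
simple_payload = {
    "email_data": {
        "subject": "Contract renewal due Friday",
        "body": "Please sign by Friday.",
    }
}


def subject_and_body(payload: dict) -> tuple[str, str]:
    """Pull subject and body text out of either shape (hypothetical helper)."""
    data = payload.get("email_data", payload)
    body = data.get("body", "")
    if isinstance(body, dict):
        # Graph wraps the body as {"contentType": ..., "content": ...}.
        body = body.get("content", "")
    return data.get("subject", ""), body
```

Either dictionary could be serialized as JSON and sent in a POST request to the `/classify` endpoint.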
## Project Structure
```
email-classifier/
├── app/
│   ├── main.py                  # FastAPI app entry point
│   ├── config.py                # Pydantic settings from environment variables
│   ├── classifier.py            # Core classification orchestration
│   ├── llm_adapters.py          # OpenAI- and Anthropic-compatible adapter layer
│   ├── models.py                # Pydantic request/response models
│   ├── prompts.py               # System prompt sent to the LLM
│   ├── sync.py                  # Deduplication logic and content fingerprinting
│   ├── dedupe_store.py          # SQLite persistence for dedupe tracking
│   ├── routers/
│   │   └── classify_email.py    # /classify POST endpoint
│   └── helpers/
│       ├── clean_email_html.py
│       ├── extract_latest_message.py
│       └── remove_disclaimer.py
├── docs/                        # MkDocs documentation (this site)
├── Dockerfile
├── pyproject.toml
└── uv.lock
```
## Output Classification Schema
Each email is assigned one of these categories:
| Category | Description |
|---|---|
| `action_required` | Direct request requiring user action |
| `question` | Question needing a response |
| `fyi` | Informational, no reply needed |
| `newsletter` | Newsletter or publication |
| `promotional` | Marketing or sales outreach |
| `automated` | Automated system notification |
| `alert` | IT or security alert |
| `uncategorized` | Fallback when classification fails |
Priority is one of: `high`, `medium`, `low`.
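
The category and priority values above could be modeled as string enums, as in the sketch below. The class names and the `parse_category` helper are hypothetical and are not taken from `app/models.py`. Treating any unrecognized label as `uncategorized` is one assumed reading of "fallback when classification fails".

```python
from enum import Enum


class Category(str, Enum):
    ACTION_REQUIRED = "action_required"
    QUESTION = "question"
    FYI = "fyi"
    NEWSLETTER = "newsletter"
    PROMOTIONAL = "promotional"
    AUTOMATED = "automated"
    ALERT = "alert"
    UNCATEGORIZED = "uncategorized"


class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def parse_category(raw: str) -> Category:
    """Map a raw model label to a Category, falling back to UNCATEGORIZED."""
    try:
        return Category(raw.strip().lower())
    except ValueError:
        # Unknown label: treat it as a failed classification.
        return Category.UNCATEGORIZED
```

Subclassing `str` keeps the members JSON-friendly, so a value such as `Category.FYI` compares equal to the plain string `"fyi"`.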