Add MkDocs documentation
Covers: overview, setup, API reference, configuration, testing, deployment, and known quirks.
61
docs/index.md
Normal file
@@ -0,0 +1,61 @@
# email-classifier

FastAPI service that classifies emails using a configurable LLM backend. It accepts Outlook-shaped email JSON payloads, extracts structured classification data, and tracks duplicate classifications using a local SQLite dedupe store.
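
A local SQLite dedupe store of the kind described above could look roughly like the following. This is a minimal sketch built on Python's standard `sqlite3` module, not the contents of `app/dedupe_store.py`: the `DedupeStore` class, the `seen_emails` table, and its columns are hypothetical names chosen for illustration.

```python
import sqlite3


class DedupeStore:
    """Hypothetical SQLite-backed store recording which dedupe keys were seen."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the sketch self-contained; a real service would
        # point this at a file on disk so keys survive restarts.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS seen_emails ("
            " dedupe_key TEXT PRIMARY KEY,"
            " first_seen TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
        )
        self._conn.commit()

    def mark_seen(self, dedupe_key: str) -> bool:
        """Record the key; return True if it was new, False if a duplicate."""
        cur = self._conn.execute(
            "INSERT OR IGNORE INTO seen_emails (dedupe_key) VALUES (?)",
            (dedupe_key,),
        )
        self._conn.commit()
        # rowcount is 1 when the row was inserted and 0 when it was ignored.
        return cur.rowcount == 1
```

Relying on the primary-key constraint with `INSERT OR IGNORE` makes the "have we seen this?" check and the write a single statement, which avoids a separate lookup before each insert.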

## Purpose

This service is designed to help workflow systems (e.g., Todoist ticket creation) automatically process incoming emails by:

- Determining whether an email requires action
- Extracting priority, category, suggested task title/notes, people, organizations, and deadlines
- Deduplicating repeated emails based on Outlook message ID, conversation ID, or content fingerprinting
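
The deduplication step in the list above can be sketched as a function that picks one key per email. This is an illustration only: the precedence (message ID, then conversation ID, then fingerprint), the SHA-256 hash, the whitespace and case normalization, and the names `content_fingerprint` and `dedupe_key` are all assumptions, not the behaviour of `app/sync.py`.

```python
import hashlib
from typing import Optional


def content_fingerprint(subject: str, body: str) -> str:
    """Hash the normalized subject and body so trivially different copies match."""
    # Lowercase and collapse all whitespace runs before hashing (assumed rule).
    normalized = " ".join(f"{subject}\n{body}".lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_key(
    subject: str,
    body: str,
    message_id: Optional[str] = None,
    conversation_id: Optional[str] = None,
) -> str:
    """Return the most specific identifier available (assumed precedence)."""
    if message_id:
        return f"message:{message_id}"
    if conversation_id:
        return f"conversation:{conversation_id}"
    return f"fingerprint:{content_fingerprint(subject, body)}"
```

Prefixing each key with its kind keeps a message ID from ever colliding with a conversation ID or a hash that happens to share the same string.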

## Key Features

- **Configurable LLM providers**: OpenAI-compatible (Ollama, LM Studio, OpenAI) or Anthropic-compatible (MiniMax, Anthropic API)
- **Outlook-shaped input**: Accepts native Microsoft Graph API email payloads with no transformation required
- **Simplified input**: Also accepts a minimal `email_data` shape with just `subject` and `body`
- **Deduplication**: Local SQLite store tracks seen emails by message ID, conversation ID, or content fingerprint
- **Structured extraction**: Returns classification, priority, suggested task title/notes, people, organizations, deadlines, and more
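
To illustrate the two input shapes listed above, the sketch below reduces either one to a common structure. It assumes the Outlook-shaped payload follows the Microsoft Graph `message` resource (`id`, `conversationId`, `subject`, `body.content`) and that the simplified shape nests `subject` and `body` under a top-level `email_data` key. `NormalizedEmail` and `normalize_payload` are hypothetical names; the real models live in `app/models.py` and may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NormalizedEmail:
    """Hypothetical common shape used after either input form is parsed."""

    subject: str
    body: str
    message_id: Optional[str] = None
    conversation_id: Optional[str] = None


def normalize_payload(payload: dict[str, Any]) -> NormalizedEmail:
    """Accept either the simplified or the Outlook-shaped payload."""
    if "email_data" in payload:
        # Simplified shape (assumed): {"email_data": {"subject": ..., "body": ...}}
        data = payload["email_data"]
        return NormalizedEmail(
            subject=data.get("subject", ""),
            body=data.get("body", ""),
        )
    # Outlook shape: Graph puts the text under body.content.
    body = payload.get("body") or {}
    content = body.get("content", "") if isinstance(body, dict) else str(body)
    return NormalizedEmail(
        subject=payload.get("subject", ""),
        body=content,
        message_id=payload.get("id"),
        conversation_id=payload.get("conversationId"),
    )
```

With this shape, the simplified input carries no identifiers, so a caller using it would have only the content fingerprint available for deduplication.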

## Project Structure

```
email-classifier/
├── app/
│   ├── main.py              # FastAPI app entry point
│   ├── config.py            # Pydantic settings from environment variables
│   ├── classifier.py        # Core classification orchestration
│   ├── llm_adapters.py      # OpenAI- and Anthropic-compatible adapter layer
│   ├── models.py            # Pydantic request/response models
│   ├── prompts.py           # System prompt sent to the LLM
│   ├── sync.py              # Deduplication logic and content fingerprinting
│   ├── dedupe_store.py      # SQLite persistence for dedupe tracking
│   ├── routers/
│   │   └── classify_email.py  # /classify POST endpoint
│   └── helpers/
│       ├── clean_email_html.py
│       ├── extract_latest_message.py
│       └── remove_disclaimer.py
├── docs/                    # MkDocs documentation (this site)
├── Dockerfile
├── pyproject.toml
└── uv.lock
```

## Output Classification Schema

Emails are classified into one of these categories:

| Category | Description |
|---|---|
| `action_required` | Direct request requiring user action |
| `question` | Question needing a response |
| `fyi` | Informational, no reply needed |
| `newsletter` | Newsletter or publication |
| `promotional` | Marketing or sales outreach |
| `automated` | Automated system notification |
| `alert` | I.T. or security alert |
| `uncategorized` | Fallback when classification fails |

Priority is one of: `high`, `medium`, `low`.
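
The categories and priorities above map naturally onto enums. The sketch below shows one way to express them, with a parser that falls back to `uncategorized` for unrecognized LLM output. The enum values come from the table; the class names, the `parse_category` helper, and its strip-and-lowercase handling are assumptions, not the definitions in `app/models.py`.

```python
from enum import Enum


class Category(str, Enum):
    ACTION_REQUIRED = "action_required"
    QUESTION = "question"
    FYI = "fyi"
    NEWSLETTER = "newsletter"
    PROMOTIONAL = "promotional"
    AUTOMATED = "automated"
    ALERT = "alert"
    UNCATEGORIZED = "uncategorized"


class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def parse_category(raw: str) -> Category:
    """Map raw LLM output to a Category, falling back to UNCATEGORIZED."""
    try:
        return Category(raw.strip().lower())
    except ValueError:
        # Anything outside the known set is treated as a failed classification.
        return Category.UNCATEGORIZED
```

Subclassing `str` keeps the members JSON-friendly, so they can be used directly as field types in Pydantic models.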