# email-classifier

FastAPI service that classifies email using a configurable LLM backend, returns richer structured extraction, and tracks duplicate classifications using Outlook-aware dedupe.

## Environment configuration

LLM defaults:

```bash
export LLM_PROVIDER=openai
export LLM_BASE_URL=http://ollama.internal.henryhosted.com:9292/v1
export LLM_API_KEY=none
export LLM_MODEL=qwen2.5-7b-instruct.q4_k_m
export LLM_TEMPERATURE=0.1
export LLM_TIMEOUT_SECONDS=60
export LLM_MAX_RETRIES=3
```

MiniMax via Anthropic-compatible API:

```bash
export LLM_PROVIDER=anthropic
export LLM_BASE_URL=https://api.minimax.io/anthropic
export LLM_API_KEY=your_minimax_key
export LLM_MODEL=MiniMax-M2.7
```

Optional local dedupe store path:

```bash
export EMAIL_CLASSIFIER_DB_PATH=.data/email_classifier.db
```

## Input shape

The request model accepts either:

- simplified input via `email_data`
- or native Outlook-style fields directly

Full Outlook-shaped example:

```json
{
  "id": "AAMk...",
  "internetMessageId": "<...@...>",
  "conversationId": "AAQk...",
  "subject": "MB Printer",
  "bodyPreview": "Good morning, ...",
  "body": {
    "contentType": "html",
    "content": "...(full HTML body)"
  },
  "sender": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "from": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "toRecipients": [
    {
      "emailAddress": {
        "name": "IT Helpdesk Mail",
        "address": "helpdeskmail@grandportage.com"
      }
    }
  ],
  "ccRecipients": [],
  "bccRecipients": [],
  "replyTo": [],
  "receivedDateTime": "2026-02-19T15:27:35Z",
  "sentDateTime": "2026-02-19T15:27:32Z",
  "hasAttachments": false,
  "importance": "normal",
  "isRead": false,
  "flag": {
    "flagStatus": "notFlagged"
  },
  "provider": "anthropic",
  "base_url": "https://api.minimax.io/anthropic",
  "model": "MiniMax-M2.7"
}
```

Simplified request example:

```json
{
  "email_data": {
    "subject": "MB Printer",
    "body": "..."
  },
  "id": "AAMk...",
  "conversationId": "AAQk..."
}
```

## Response example

```json
{
  "needs_action": true,
  "category": "question",
  "priority": "high",
  "task_description": "Investigate MB Printer issue and reply",
  "reasoning": "The email appears to describe an issue requiring action.",
  "confidence": 0.91,
  "details": {
    "summary": "Printer issue reported in the MB area.",
    "suggested_title": "Handle MB Printer issue",
    "suggested_notes": "Review the printer problem, identify urgency, and reply with next steps.",
    "deadline": null,
    "people": [],
    "organizations": [],
    "attachments_referenced": [],
    "next_steps": ["Review issue", "Respond to sender"],
    "key_points": ["Printer issue reported"],
    "source_signals": ["request"],
    "dedupe_key": "..."
  },
  "dedupe": {
    "status": "new",
    "seen_count": 1,
    "matched_on": "none",
    "message_id": "AAMk...",
    "conversation_id": "AAQk...",
    "fingerprint": "..."
  }
}
```

## Dedupe precedence

1. `id` for exact Outlook message match
2. `conversationId` for thread grouping
3. normalized subject + preview/body fingerprint fallback

Statuses:

- `new`: no prior similar email seen
- `duplicate`: same dedupe target and same extracted result as before
- `updated`: matched prior email, but extracted result changed

This is intentionally heuristic for the fallback path.

## Notes

- No Todoist integration lives in this API.
- Dedupe is local and intended to help downstream workflows avoid obvious duplicates.
- SQLite is used for lightweight local dedupe tracking.
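
### Dedupe sketch

The precedence and statuses described under "Dedupe precedence" can be illustrated with a minimal sketch. This is not the service's actual code: the helper names (`normalize`, `candidate_keys`, `check`), the use of an in-memory dict in place of the SQLite store, the SHA-256 fingerprint, the reply-prefix stripping, and the `matched_on` values other than `"none"` are all assumptions made for illustration. It also reads only top-level Outlook-style fields, not the simplified `email_data` shape.

```python
import hashlib
import re
from typing import Dict, List, Tuple

Key = Tuple[str, str]


def normalize(text: str) -> str:
    """Strip leading RE:/FW:/FWD: prefixes, collapse whitespace, lowercase."""
    text = re.sub(r"^\s*((re|fw|fwd)\s*:\s*)+", "", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip().lower()


def candidate_keys(email: dict) -> List[Key]:
    """List dedupe keys in precedence order: id, conversationId, fingerprint."""
    keys: List[Key] = []
    if email.get("id"):
        keys.append(("id", email["id"]))
    if email.get("conversationId"):
        keys.append(("conversationId", email["conversationId"]))
    # Fallback fingerprint (assumed): hash of normalized subject + preview.
    subject = normalize(email.get("subject") or "")
    preview = normalize(email.get("bodyPreview") or "")
    digest = hashlib.sha256(f"{subject}\n{preview}".encode("utf-8")).hexdigest()
    keys.append(("fingerprint", digest))
    return keys


def check(store: Dict[Key, dict], email: dict, result: dict) -> dict:
    """Classify an email as new, duplicate, or updated, then record it."""
    keys = candidate_keys(email)
    status, matched_on = "new", "none"
    for kind, value in keys:
        previous = store.get((kind, value))
        if previous is not None:
            # The first key that hits wins; compare extracted results.
            matched_on = kind
            status = "duplicate" if previous == result else "updated"
            break
    # Record the latest result under every key for future lookups.
    for key in keys:
        store[key] = result
    return {"status": status, "matched_on": matched_on}
```

In this sketch, a second message in the same thread with a different `id` falls through to the `conversationId` match, and a message with neither identifier is matched only by the heuristic fingerprint. A real implementation would persist the keys in SQLite and would probably track `seen_count` as well.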