email-classifier
FastAPI service that classifies email with a configurable LLM backend, returns a structured extraction alongside the classification, and tracks duplicate classifications with Outlook-aware dedupe.
Environment configuration
LLM defaults:
export LLM_PROVIDER=openai
export LLM_BASE_URL=http://ollama.internal.henryhosted.com:9292/v1
export LLM_API_KEY=none
export LLM_MODEL=qwen2.5-7b-instruct.q4_k_m
export LLM_TEMPERATURE=0.1
export LLM_TIMEOUT_SECONDS=60
export LLM_MAX_RETRIES=3
MiniMax via Anthropic-compatible API:
export LLM_PROVIDER=anthropic
export LLM_BASE_URL=https://api.minimax.io/anthropic
export LLM_API_KEY=your_minimax_key
export LLM_MODEL=MiniMax-M2.7
Optional local dedupe store path:
export EMAIL_CLASSIFIER_DB_PATH=.data/email_classifier.db
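For orientation, here is a minimal sketch of how these variables could be read at startup. It is not the service's actual loader: the `LLMSettings` class and `load_settings` function are hypothetical names, and treating `.data/email_classifier.db` as the fallback path is an assumption taken from the example above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the variables documented above."""
    provider: str
    base_url: str
    api_key: str
    model: str
    temperature: float
    timeout_seconds: float
    max_retries: int
    db_path: str


def load_settings(env=None) -> LLMSettings:
    """Read the documented variables, falling back to the listed defaults."""
    env = os.environ if env is None else env
    return LLMSettings(
        provider=env.get("LLM_PROVIDER", "openai"),
        base_url=env.get("LLM_BASE_URL", "http://ollama.internal.henryhosted.com:9292/v1"),
        api_key=env.get("LLM_API_KEY", "none"),
        model=env.get("LLM_MODEL", "qwen2.5-7b-instruct.q4_k_m"),
        # Numeric values arrive as strings and are converted here.
        temperature=float(env.get("LLM_TEMPERATURE", "0.1")),
        timeout_seconds=float(env.get("LLM_TIMEOUT_SECONDS", "60")),
        max_retries=int(env.get("LLM_MAX_RETRIES", "3")),
        # Assumption: the example path doubles as the default.
        db_path=env.get("EMAIL_CLASSIFIER_DB_PATH", ".data/email_classifier.db"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"LLM_PROVIDER": "anthropic"})`.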
Input shape
The request model accepts either:
- simplified input via email_data
- or native Outlook-style fields directly
Full Outlook-shaped example:
{
"id": "AAMk...",
"internetMessageId": "<...@...>",
"conversationId": "AAQk...",
"subject": "MB Printer",
"bodyPreview": "Good morning, ...",
"body": {
"contentType": "html",
"content": "<html>...(full HTML body)</html>"
},
"sender": {
"emailAddress": {
"name": "Bobbi Johnson",
"address": "bobbi.johnson@grandportage.com"
}
},
"from": {
"emailAddress": {
"name": "Bobbi Johnson",
"address": "bobbi.johnson@grandportage.com"
}
},
"toRecipients": [
{
"emailAddress": {
"name": "IT Helpdesk Mail",
"address": "helpdeskmail@grandportage.com"
}
}
],
"ccRecipients": [],
"bccRecipients": [],
"replyTo": [],
"receivedDateTime": "2026-02-19T15:27:35Z",
"sentDateTime": "2026-02-19T15:27:32Z",
"hasAttachments": false,
"importance": "normal",
"isRead": false,
"flag": { "flagStatus": "notFlagged" },
"provider": "anthropic",
"base_url": "https://api.minimax.io/anthropic",
"model": "MiniMax-M2.7"
}
Simplified request example:
{
"email_data": {
"subject": "MB Printer",
"body": "<html>...</html>"
},
"id": "AAMk...",
"conversationId": "AAQk..."
}
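Since both shapes carry the same information, a caller or the service itself can flatten them into one internal form. The sketch below shows one way to do that; `normalize_email` and the flattened key names are hypothetical and not taken from the service code.

```python
from typing import Any


def normalize_email(payload: dict[str, Any]) -> dict[str, Any]:
    """Flatten either request shape into subject/body/id/conversation_id."""
    # The simplified shape nests subject and body under "email_data".
    simple = payload.get("email_data") or {}
    body = simple.get("body", payload.get("body"))
    # Outlook sends body as {"contentType": ..., "content": ...}.
    if isinstance(body, dict):
        body = body.get("content")
    return {
        "subject": simple.get("subject", payload.get("subject")) or "",
        # Fall back to the preview when no full body is present.
        "body": body or payload.get("bodyPreview") or "",
        "id": payload.get("id"),
        "conversation_id": payload.get("conversationId"),
    }
```

With the two examples above, both calls yield the subject "MB Printer" and the same `id` and `conversation_id` values, so later steps do not need to know which shape was sent.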
Response example
{
"needs_action": true,
"category": "question",
"priority": "high",
"task_description": "Investigate MB Printer issue and reply",
"reasoning": "The email appears to describe an issue requiring action.",
"confidence": 0.91,
"details": {
"summary": "Printer issue reported in the MB area.",
"suggested_title": "Handle MB Printer issue",
"suggested_notes": "Review the printer problem, identify urgency, and reply with next steps.",
"deadline": null,
"people": [],
"organizations": [],
"attachments_referenced": [],
"next_steps": ["Review issue", "Respond to sender"],
"key_points": ["Printer issue reported"],
"source_signals": ["request"],
"dedupe_key": "..."
},
"dedupe": {
"status": "new",
"seen_count": 1,
"matched_on": "none",
"message_id": "AAMk...",
"conversation_id": "AAQk...",
"fingerprint": "..."
}
}
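A downstream workflow might use this response to decide whether to create a task. The sketch below is only an illustration: `should_create_task` and the `min_confidence` threshold are hypothetical and not part of this API.

```python
def should_create_task(response: dict, min_confidence: float = 0.5) -> bool:
    """Return True when the email needs action and is not an obvious duplicate."""
    if not response.get("needs_action"):
        return False
    # Assumption: callers want to ignore low-confidence classifications.
    if response.get("confidence", 0.0) < min_confidence:
        return False
    # "new" and "updated" both warrant attention; "duplicate" does not.
    return response.get("dedupe", {}).get("status") != "duplicate"
```

Treating `updated` like `new` is a design choice for this example; a workflow that edits existing tasks could branch on the status instead.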
Dedupe precedence
- id for exact Outlook message match
- conversationId for thread grouping
- normalized subject + preview/body fingerprint fallback
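The precedence above can be sketched as an ordered list of candidate keys. This is an illustration under stated assumptions, not the service's implementation: the function names, the reply/forward prefix stripping, the 200-character cutoff, and the use of SHA-256 are all choices made for this example.

```python
import hashlib
import re


def normalize_subject(subject: str) -> str:
    """Strip reply/forward prefixes, collapse whitespace, lowercase."""
    s = subject.strip()
    # Drop any run of leading "Re:", "Fw:", or "Fwd:" markers.
    s = re.sub(r"^(?:(?:re|fwd?)\s*:\s*)+", "", s, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", s).strip().lower()


def fingerprint(subject: str, text: str) -> str:
    """Hash the normalized subject plus the first 200 chars of normalized text."""
    body = re.sub(r"\s+", " ", text).strip().lower()[:200]
    raw = f"{normalize_subject(subject)}\n{body}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def candidate_keys(email: dict) -> list:
    """Return (kind, key) pairs, strongest match first."""
    keys = []
    if email.get("id"):
        keys.append(("id", email["id"]))
    if email.get("conversationId"):
        keys.append(("conversationId", email["conversationId"]))
    # The fingerprint is always available, so it serves as the fallback.
    text = email.get("bodyPreview") or ""
    keys.append(("fingerprint", fingerprint(email.get("subject") or "", text)))
    return keys
```

A lookup would then try each pair in order and stop at the first one already present in the store.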
Statuses:
- new: no prior similar email seen
- duplicate: same dedupe target and same extracted result as before
- updated: matched prior email, but extracted result changed
The fallback path is intentionally heuristic: it matches on the normalized subject plus a preview/body fingerprint rather than on a stable identifier.
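To make the three statuses concrete, here is a minimal in-memory sketch of how a status could be derived. It assumes each email arrives with an ordered list of `(kind, key)` pairs and a digest of its extracted result; `classify_status`, the dict-based store, and the `matched_on` values other than "none" are hypothetical.

```python
def classify_status(store: dict, keys: list, result_digest: str) -> dict:
    """Try each (kind, key) pair in order; the first hit decides the status."""
    for matched_on, key in keys:
        record = store.get((matched_on, key))
        if record is None:
            continue
        # Same extracted result as before means duplicate, otherwise updated.
        status = "duplicate" if record["digest"] == result_digest else "updated"
        record["digest"] = result_digest
        record["seen_count"] += 1
        return {"status": status, "seen_count": record["seen_count"], "matched_on": matched_on}
    # No key matched: remember the email under every key so later lookups can hit.
    record = {"digest": result_digest, "seen_count": 1}
    for pair in keys:
        store[tuple(pair)] = record
    return {"status": "new", "seen_count": 1, "matched_on": "none"}
```

Sharing one record across all of an email's keys keeps `seen_count` consistent whichever key a later email matches on.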
Notes
- No Todoist integration lives in this API.
- Dedupe is local and intended to help downstream workflows avoid obvious duplicates.
- SQLite is used for lightweight local dedupe tracking.
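As a rough picture of what lightweight local tracking in SQLite can look like, the sketch below uses Python's standard `sqlite3` module. The table name `seen`, its columns, and both function names are assumptions for illustration and do not reflect the service's real schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the dedupe database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        " dedupe_key TEXT PRIMARY KEY,"
        " result_digest TEXT NOT NULL,"
        " seen_count INTEGER NOT NULL)"
    )
    return conn


def record_seen(conn: sqlite3.Connection, dedupe_key: str, result_digest: str) -> int:
    """Insert or bump the row for dedupe_key and return the new seen_count."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT seen_count FROM seen WHERE dedupe_key = ?", (dedupe_key,)
        ).fetchone()
        count = 1 if row is None else row[0] + 1
        conn.execute(
            "INSERT OR REPLACE INTO seen (dedupe_key, result_digest, seen_count)"
            " VALUES (?, ?, ?)",
            (dedupe_key, result_digest, count),
        )
    return count
```

When pointing `open_store` at a file such as the path in `EMAIL_CLASSIFIER_DB_PATH`, the parent directory has to exist first, because `sqlite3.connect` creates the file but not its directories.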