1b2c7db92493190eca78d6aa649a9257b1a0c468
email-classifier
FastAPI service that classifies email using a configurable LLM backend, returns richer structured extraction, and tracks duplicate classifications using fingerprint-based dedupe.
Environment configuration
LLM defaults:
export LLM_PROVIDER=openai
export LLM_BASE_URL=http://ollama.internal.henryhosted.com:9292/v1
export LLM_API_KEY=none
export LLM_MODEL=qwen2.5-7b-instruct.q4_k_m
export LLM_TEMPERATURE=0.1
export LLM_TIMEOUT_SECONDS=60
export LLM_MAX_RETRIES=3
MiniMax via Anthropic-compatible API:
export LLM_PROVIDER=anthropic
export LLM_BASE_URL=https://api.minimax.io/anthropic
export LLM_API_KEY=your_minimax_key
export LLM_MODEL=MiniMax-M2.7
Optional local dedupe store path:
export EMAIL_CLASSIFIER_DB_PATH=.data/email_classifier.db
API
POST /classify
This overhaul is intended to return richer extraction. Top-level compatibility is not required.
Request example:
{
"email_data": {
"subject": "Can you review this by Friday?",
"body": "Hi Daniel, please review the attached budget proposal."
},
"from_address": "sender@example.com",
"received_at": "2026-04-09T12:55:00Z",
"provider": "anthropic",
"base_url": "https://api.minimax.io/anthropic",
"model": "MiniMax-M2.7"
}
Response example:
{
"needs_action": true,
"category": "question",
"priority": "high",
"task_description": "Review the budget proposal and respond by Friday",
"reasoning": "Direct request with a deadline requires follow-up",
"confidence": 0.91,
"details": {
"summary": "Budget proposal review requested with Friday deadline.",
"suggested_title": "Review budget proposal and respond by Friday",
"suggested_notes": "Requester asked for feedback on attached budget proposal before Friday.",
"deadline": "Friday",
"people": ["Daniel"],
"organizations": [],
"attachments_referenced": ["budget proposal"],
"next_steps": ["Review attachment", "Reply with feedback"],
"key_points": ["Deadline is Friday"],
"source_signals": ["request", "deadline"],
"dedupe_key": "..."
},
"dedupe": {
"status": "new",
"seen_count": 1,
"matched_on": "none",
"subject_key": "...",
"fingerprint": "..."
}
}
Dedupe behavior
The API does not create or update Todoist tasks. It only returns richer extraction and local dedupe metadata for downstream automation like n8n.
Matching strategy:
- normalized subject plus sender-derived
subject_key - full content fingerprint fallback based on sender + normalized subject + cleaned body
Statuses:
new: no prior similar email seenduplicate: same dedupe target and same extracted result as beforeupdated: matched prior email, but extracted result changed
This is intentionally heuristic, not perfect.
Architecture
app/classifier.py: classification orchestration and dedupe handoffapp/prompts.py: richer extraction promptapp/sync.py: subject normalization, fingerprinting, dedupe applicationapp/dedupe_store.py: SQLite-backed dedupe storeapp/llm_adapters.py: provider adaptersapp/config.py: LLM settings
Notes
- No Todoist integration lives in this API.
- Dedupe is best-effort and designed to help downstream workflows avoid obvious duplicates.
- SQLite is used for lightweight local dedupe tracking.
Description
Languages
Python
97.4%
Dockerfile
2.6%