# API Reference
## `POST /classify`
Classifies a single email and returns structured extraction results.

**Endpoint:** `POST /classify`

**Content-Type:** `application/json`

---
## Request
The endpoint accepts **two input shapes**: a full Outlook-shaped payload (native Microsoft Graph API format) or a simplified `email_data` object.
### Simplified Shape
Use this for lightweight clients or testing:
```json
{
  "email_data": {
    "subject": "Printer issue in MB",
    "body": "<html>...</html>"
  },
  "id": "AAMk...",
  "conversationId": "AAQk..."
}
```
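As a rough illustration, a client could assemble the simplified shape like this. The base URL and the helper name `build_classify_request` are hypothetical, and treating `id` and `conversationId` as optional is an assumption; the request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical base URL: this document does not say where the service is hosted.
BASE_URL = "http://localhost:8000"


def build_classify_request(subject, body_html, message_id=None, conversation_id=None):
    """Build (but do not send) a POST /classify request in the simplified shape."""
    payload = {"email_data": {"subject": subject, "body": body_html}}
    # Assumption: the identifiers are optional, so include them only when known.
    if message_id is not None:
        payload["id"] = message_id
    if conversation_id is not None:
        payload["conversationId"] = conversation_id
    return urllib.request.Request(
        f"{BASE_URL}/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_classify_request(
    "Printer issue in MB", "<html>...</html>", "AAMk...", "AAQk..."
)
```

Sending it would then be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response.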
### Full Outlook Shape
Pass an email through directly from the Microsoft Graph API:
```json
{
  "id": "AAMk...",
  "internetMessageId": "<abc123@mail.example.com>",
  "conversationId": "AAQk...",
  "subject": "MB Printer",
  "bodyPreview": "Good morning, ...",
  "body": {
    "contentType": "html",
    "content": "<html>...(full HTML body)</html>"
  },
  "sender": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "from": {
    "emailAddress": {
      "name": "Bobbi Johnson",
      "address": "bobbi.johnson@grandportage.com"
    }
  },
  "toRecipients": [
    {
      "emailAddress": {
        "name": "IT Helpdesk Mail",
        "address": "helpdeskmail@grandportage.com"
      }
    }
  ],
  "ccRecipients": [],
  "bccRecipients": [],
  "replyTo": [],
  "receivedDateTime": "2026-02-19T15:27:35Z",
  "sentDateTime": "2026-02-19T15:27:32Z",
  "hasAttachments": false,
  "importance": "normal",
  "isRead": false,
  "flag": { "flagStatus": "notFlagged" }
}
```
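A client that already holds a Graph-style message but prefers to send less data could reduce it to the simplified shape. The sketch below assumes the field mapping implied by the two examples (`subject` to `email_data.subject`, `body.content` to `email_data.body`); the helper `to_simplified_shape` is hypothetical and not part of the service.

```python
def to_simplified_shape(graph_message: dict) -> dict:
    """Reduce a Graph-style message dict to the simplified /classify payload."""
    body = graph_message.get("body") or {}
    payload = {
        "email_data": {
            "subject": graph_message.get("subject", ""),
            "body": body.get("content", ""),
        }
    }
    # Carry the identifiers over only when the source message has them.
    for key in ("id", "conversationId"):
        if graph_message.get(key):
            payload[key] = graph_message[key]
    return payload


message = {
    "id": "AAMk...",
    "conversationId": "AAQk...",
    "subject": "MB Printer",
    "body": {"contentType": "html", "content": "<html>...</html>"},
}
simplified = to_simplified_shape(message)
```

Note that this drops sender, recipient, and timestamp fields, which may or may not matter to the classifier; passing the full Outlook shape avoids that question.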
### Per-Request LLM Overrides
You can override the global LLM settings for individual requests:
| Field | Type | Description |
|---|---|---|
| `provider` | `openai` \| `anthropic` | Override the global LLM provider |
| `model` | `string` | Override the model name |
| `base_url` | `string` | Override the API base URL |
| `api_key` | `string` | Override the API key (excluded from logs) |
| `temperature` | `float` | Override the temperature (0.0–1.0) |
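
The following sketch shows one way a client might attach overrides before sending. It assumes the override fields sit at the top level of the request body next to the email fields, which this document does not state explicitly; the helper `with_llm_overrides` and the model name are hypothetical.

```python
ALLOWED_OVERRIDES = {"provider", "model", "base_url", "api_key", "temperature"}
ALLOWED_PROVIDERS = {"openai", "anthropic"}


def with_llm_overrides(payload: dict, **overrides) -> dict:
    """Return a copy of payload with validated per-request LLM overrides."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unknown override fields: {sorted(unknown)}")
    provider = overrides.get("provider")
    if provider is not None and provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    temperature = overrides.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    # Assumption: overrides are top-level keys in the request body.
    return {**payload, **overrides}


base = {"email_data": {"subject": "Printer issue in MB", "body": "<html>...</html>"}}
request_body = with_llm_overrides(
    base, provider="anthropic", model="example-model", temperature=0.2
)
```

Validating on the client side is optional; it simply surfaces typos before the request leaves the machine.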
---
## Response
```json
{
  "needs_action": true,
  "category": "action_required",
  "priority": "high",
  "task_description": "Investigate MB Printer issue and reply",
  "reasoning": "The email describes an active problem requiring I.T. attention.",
  "confidence": 0.91,
  "details": {
    "summary": "Printer issue reported in the MB area requiring investigation.",
    "suggested_title": "Handle MB Printer issue",
    "suggested_notes": "Review the printer problem, identify urgency, and reply with next steps.",
    "deadline": null,
    "people": ["Bobbi Johnson"],
    "organizations": ["Grand Portage"],
    "attachments_referenced": [],
    "next_steps": ["Review printer status", "Reply to Bobbi Johnson"],
    "key_points": ["Printer issue in MB", "Needs on-site investigation"],
    "source_signals": ["request", "problem_report"]
  },
  "dedupe": {
    "status": "new",
    "seen_count": 1,
    "matched_on": "none",
    "message_id": "AAMk...",
    "conversation_id": "AAQk...",
    "fingerprint": "a3f8b..."
  }
}
```
### Response Fields
| Field | Type | Description |
|---|---|---|
| `needs_action` | `bool` | Whether the email requires user action |
| `category` | `string` | One of the 8 classification categories |
| `priority` | `string` | `high`, `medium`, or `low` |
| `task_description` | `string\|null` | Short action-oriented description |
| `reasoning` | `string` | One-sentence explanation of the classification |
| `confidence` | `float` | Model confidence score (0.0–1.0) |
| `details` | `object` | Structured extraction (see below) |
| `dedupe` | `object` | Deduplication result (see below) |
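
To make the top-level fields concrete, here is a minimal sketch of a typed wrapper a client might use. The `ClassifyResult` class and `parse_classify_response` helper are hypothetical; the checks only cover what the table above states (the three priority values and the confidence range).

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ClassifyResult:
    needs_action: bool
    category: str
    priority: str
    task_description: Optional[str]
    reasoning: str
    confidence: float
    details: dict
    dedupe: dict


def parse_classify_response(data: dict) -> ClassifyResult:
    """Validate the documented constraints and wrap the response in a dataclass."""
    if data["priority"] not in {"high", "medium", "low"}:
        raise ValueError(f"unexpected priority: {data['priority']!r}")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError(f"confidence out of range: {data['confidence']!r}")
    return ClassifyResult(
        needs_action=data["needs_action"],
        category=data["category"],
        priority=data["priority"],
        task_description=data.get("task_description"),
        reasoning=data["reasoning"],
        confidence=data["confidence"],
        details=data["details"],
        dedupe=data["dedupe"],
    )


sample = {
    "needs_action": True,
    "category": "action_required",
    "priority": "high",
    "task_description": "Investigate MB Printer issue and reply",
    "reasoning": "The email describes an active problem requiring I.T. attention.",
    "confidence": 0.91,
    "details": {"people": ["Bobbi Johnson"]},
    "dedupe": {"status": "new", "seen_count": 1},
}
result = parse_classify_response(sample)
```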
### `details` Object
| Field | Type | Description |
|---|---|---|
| `summary` | `string\|null` | Brief human-readable summary |
| `suggested_title` | `string\|null` | Suggested title for a task (for example, in Todoist) |
| `suggested_notes` | `string\|null` | Multiline notes for a human reviewer |
| `deadline` | `string\|null` | Any date/time deadline mentioned |
| `people` | `string[]` | People involved or referenced |
| `organizations` | `string[]` | Organizations, departments, vendors, teams |
| `attachments_referenced` | `string[]` | Attachment names mentioned in the email |
| `next_steps` | `string[]` | Specific recommended next actions |
| `key_points` | `string[]` | Important context bullets |
| `source_signals` | `string[]` | Signals that triggered the classification |
| `dedupe_key` | `string\|null` | Content fingerprint (SHA-256) |
### `dedupe` Object
| Field | Type | Description |
|---|---|---|
| `status` | `new \| duplicate \| updated` | Whether this is new, a duplicate, or updated |
| `seen_count` | `int` | Number of times this email thread has been seen |
| `matched_on` | `none \| id \| conversation \| fingerprint` | Which dedupe mechanism matched |
| `message_id` | `string\|null` | Outlook `id` field if available |
| `conversation_id` | `string\|null` | Outlook `conversationId` if available |
| `fingerprint` | `string` | SHA-256 content fingerprint |
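
One possible way for a client to act on `dedupe.status` is sketched below. The create/update/skip policy and the function name `task_action` are assumptions for illustration; the service itself only reports the status.

```python
def task_action(result: dict) -> str:
    """Map a /classify response to a hypothetical client action.

    Returns "create", "update", or "skip".
    """
    status = result["dedupe"]["status"]
    if status not in {"new", "duplicate", "updated"}:
        raise ValueError(f"unexpected dedupe status: {status!r}")
    # Assumed policy: never act twice on a duplicate, and ignore
    # emails the classifier says need no action.
    if status == "duplicate" or not result["needs_action"]:
        return "skip"
    return "update" if status == "updated" else "create"


new_email = {"needs_action": True, "dedupe": {"status": "new", "seen_count": 1}}
repeat_email = {"needs_action": True, "dedupe": {"status": "duplicate", "seen_count": 2}}
actions = [task_action(new_email), task_action(repeat_email)]
```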
---
## Error Responses
If the request contains neither `email_data` nor the Outlook body fields, the API returns `422 Unprocessable Entity` with a validation error.

If classification fails after all retries, the service still returns `200 OK`, with an `uncategorized` result and `confidence: 0.0`. Clients should therefore inspect the response body rather than rely on the status code alone.
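
Because a failed classification still comes back with a success status, a client may want to distinguish the two cases explicitly. The sketch below is one way to do that; the function name and the returned labels are hypothetical.

```python
def interpret_classify_response(status_code: int, body: dict) -> str:
    """Label a /classify outcome from its HTTP status code and parsed JSON body."""
    if status_code == 422:
        # Neither email_data nor Outlook body fields were present.
        return "invalid_request"
    if status_code != 200:
        return "unexpected_status"
    # Documented fallback: an uncategorized result with zero confidence.
    if body.get("category") == "uncategorized" and body.get("confidence") == 0.0:
        return "classification_failed"
    return "classified"


outcomes = [
    interpret_classify_response(200, {"category": "action_required", "confidence": 0.91}),
    interpret_classify_response(200, {"category": "uncategorized", "confidence": 0.0}),
    interpret_classify_response(422, {}),
]
```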