# Work Queue API: Specification (Python Rewrite)

## Overview

A lightweight internal API that tracks the full lifecycle of work items across TheLab agents. Marcus A. (main dispatcher) submits work; agents poll for their queue and update status; Marcus monitors for exceptions.

---

## Tech Stack

- **Language:** Python 3.12+
- **Package manager:** uv
- **Web framework:** FastAPI (or Flask if preferred)
- **Database:** PostgreSQL
- **Docs:** MkDocs
- **Container:** Docker, pushed to `git.danhenry.dev/thelab/work-queue-api`

---

## Database Schema (PostgreSQL)

### Table: `projects`

| Column | Type | Notes |
|---|---|---|
| id | UUID PRIMARY KEY | |
| name | TEXT NOT NULL | Human-readable project name |
| external_ref | TEXT | Todoist project ID, GitHub repo, etc. (optional) |
| created_at | TIMESTAMPTZ | ISO 8601 |
| updated_at | TIMESTAMPTZ | ISO 8601 |

### Table: `work_items`

| Column | Type | Notes |
|---|---|---|
| id | UUID PRIMARY KEY | |
| project_id | UUID FK | References projects.id (optional) |
| type | TEXT NOT NULL | e.g. `code_review`, `bug_fix`, `infra_setup`, `gitea_issue` |
| description | TEXT NOT NULL | Human-readable summary |
| payload | JSONB | Type-specific fields |
| priority | INTEGER | 1-5, 1=highest. Default 3 |
| status | TEXT NOT NULL | See Status Lifecycle below |
| assigned_agent | TEXT | e.g. `steve-w`. NULL until dispatched |
| created_by | TEXT | e.g. `marcus-a`, `gitea-watcher`, `bms-ticket-workflow` |
| created_at | TIMESTAMPTZ | ISO 8601 |
| updated_at | TIMESTAMPTZ | ISO 8601 |
| completed_at | TIMESTAMPTZ | ISO 8601, set when status → completed/failed/cancelled |
| outcome | TEXT | `success`, `failed`, `cancelled`, NULL |
| notes | TEXT | Agent-added notes, URLs, context |

### Status Lifecycle

```
queued → dispatched → in_progress → completed
                                  ↘ blocked
                                  ↘ failed
       ↘ cancelled (from queued or dispatched only)
```

- `queued`: New work, waiting for Marcus to dispatch
- `dispatched`: Marcus has assigned to an agent (agent has not yet picked it up)
- `in_progress`: Agent acknowledged and is working it
- `blocked`: Agent hit a holding condition (waiting on external input, dependencies, etc.)
- `failed`: Agent attempted but hit an unrecoverable error
- `completed`: Agent finished successfully; Marcus reviews before marking truly done
- `cancelled`: Marcus killed it before work started

### Table: `dispatch_log`

| Column | Type | Notes |
|---|---|---|
| id | UUID PRIMARY KEY | |
| work_item_id | UUID FK | References work_items.id |
| dispatched_at | TIMESTAMPTZ | ISO 8601 |
| agent | TEXT | Which agent it was dispatched to |
| completed_at | TIMESTAMPTZ | ISO 8601, when status reached terminal state |
| outcome | TEXT | success, failed, cancelled |

### Constraints

- One `in_progress` work item per agent at any time (enforced via DB constraint or application logic)
- `completed_at` and `outcome` can only be set when status is terminal (`completed`, `failed`, `cancelled`)

---

## API Endpoints

### Projects

- **`POST /projects`**: create project
- **`GET /projects`**: list all
- **`GET /projects/:id`**: get one
- **`PATCH /projects/:id`**: update name or external_ref

### Work Items

- **`POST /work`**: create item (status=`queued` on create)
- **`GET /work`**: list with filters: `?status=`, `?agent=`, `?project_id=`, `?since=`. **Sort:** priority ASC, created_at ASC
- **`GET /work/:id`**: single item with dispatch history
- **`PATCH /work/:id`**: update status, outcome, notes, assigned_agent
- **`DELETE /work/:id`**: cancel (status=`cancelled`). Returns 204.

### Monitoring (for Marcus heartbeat)

- **`GET /work?status=in_progress`**: what's being worked on right now
- **`GET /work?status=blocked`**: items that need intervention
- **`GET /work?status=failed`**: items that need review
- **`GET /work?status=completed&since=`**: completed since last check

---

## Dispatch Flow

1. Marcus POSTs /work → status=`queued`
2. Marcus PATCHes /work/:id with status=`dispatched`, assigned_agent=steve-w
3. (optional) Immediately PATCH to status=`in_progress`
4. Steve polls GET /work?agent=steve-w&status=dispatched, PATCHes to in_progress, works, PATCHes completed

---

## Stale Task Detection

Marcus's heartbeat checks `in_progress` items. If any item has `updated_at` older than 30 minutes, Marcus marks it `blocked` and alerts Daniel.

---

## Project Structure

```
work-queue-api/
├── SPEC.md
├── Dockerfile
├── docker-compose.yml          ← must include postgres container
├── .github/
│   └── workflows/
│       └── ci.yml              ← build + push to git.danhenry.dev/thelab/work-queue-api:latest
├── pyproject.toml / uv project files
├── mkdocs.yml                  ← MkDocs configuration
├── docs/
│   └── index.md                ← Docker Compose example + usage docs
├── app/
│   ├── __init__.py
│   ├── main.py                 ← FastAPI app entry
│   ├── config.py               ← Settings (DATABASE_URL, PORT, etc.)
│   ├── models.py               ← Pydantic models
│   ├── db.py                   ← PostgreSQL connection
│   ├── routers/
│   │   ├── __init__.py
│   │   ├── projects.py
│   │   └── work.py
│   └── migrations/
│       └── 001_initial.sql
└── tests/
    └── test_api.py
```

---

## CI/CD

GitHub Actions (`ci.yml`):

1. Build Docker image on push to `main`
2. Push to `git.danhenry.dev/thelab/work-queue-api:latest`
3. Tag with short SHA

Docker registry: `git.danhenry.dev`

Secrets available in thelab org: `DOCKER_REGISTRY`, `DOCKER_USERNAME`, `DOCKER_PASSWORD`

---

## Docker Compose Example (must be in docs AND in docker-compose.yml)

```yaml
version: '3.8'
services:
  api:
    image: git.danhenry.dev/thelab/work-queue-api:latest
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/work_queue
      - PORT=8080
    depends_on:
      db:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=work_queue
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
```

---

## Docs (MkDocs)

`mkdocs.yml` with:

- `site_name: Work Queue API`
- `repo_url: https://git.danhenry.dev/thelab/work-queue-api`
- Nav structure: Getting Started, API Reference, Docker Compose

`docs/index.md` must include:

- Overview
- Docker Compose full example (the postgres version)
- Quick start
- API endpoint reference
- Status lifecycle diagram

---

## Skills to build after API is done

For Marcus (main dispatcher):

- `work-queue` skill: thin wrapper around HTTP calls
  - `work_add(type, description, payload, agent, project_id?)` → POST /work
  - `work_dispatch(work_item_id, agent)` → PATCH status=dispatched+in_progress
  - `work_update(work_item_id, status, outcome?, notes?)` → PATCH
  - `work_list(status?, agent?, project_id?)` → GET /work
  - `work_stale_check()` → poll in_progress, timeout stale items

For Steve's agent:

- Polling skill: every N minutes, GET /work?agent=steve-w&status=dispatched, pick up items
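
As a rough illustration of the `work_stale_check()` helper listed for Marcus's skill, the sketch below polls `in_progress` items and marks as `blocked` any whose `updated_at` is older than the 30-minute threshold from Stale Task Detection. Several details are assumptions rather than part of this spec: that `GET /work` returns a JSON array of objects with `id` and `updated_at` fields, that timestamps carry a UTC offset, the `BASE_URL` value, the `notes` text, and the helper names `_request`, `parse_ts`, and `find_stale`.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

BASE_URL = "http://localhost:8080"   # assumption: the port mapped in the Compose example
STALE_AFTER = timedelta(minutes=30)  # threshold from "Stale Task Detection"


def _request(method, path, body=None):
    """Send one JSON request to the API and decode the JSON response, if any."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC.

    Assumes the value carries an offset, so the result is timezone-aware.
    """
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_stale(items, now, stale_after=STALE_AFTER):
    """Return the items whose updated_at is older than the threshold."""
    return [item for item in items if now - parse_ts(item["updated_at"]) > stale_after]


def work_stale_check(request=_request, now=None):
    """Mark stale in_progress items as blocked and return their ids.

    `request` is injectable so the logic can be exercised without a server.
    Alerting Daniel is left to the caller, which receives the stale ids.
    """
    now = now or datetime.now(timezone.utc)
    in_progress = request("GET", "/work?status=in_progress")
    stale = find_stale(in_progress, now)
    for item in stale:
        request(
            "PATCH",
            f"/work/{item['id']}",
            {"status": "blocked", "notes": "No update for over 30 minutes"},
        )
    return [item["id"] for item in stale]
```

Keeping the staleness check in a pure function (`find_stale`) separate from the HTTP calls is one possible design; it lets the 30-minute rule be unit-tested with fixed timestamps while the transport stays a thin wrapper.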