work-queue-api/SPEC.md
2026-04-11 13:29:04 -05:00

Work Queue API — Specification

Overview

A lightweight internal API that tracks the full lifecycle of work items across TheLab agents. Marcus A. (main dispatcher) submits work; agents poll for their queue and update status; Marcus monitors for exceptions.


Database Schema (SQLite)

Table: projects

| Column | Type | Notes |
|---|---|---|
| id | TEXT PRIMARY KEY | UUID |
| name | TEXT NOT NULL | Human-readable project name |
| external_ref | TEXT | Todoist project ID, GitHub repo, etc. (optional) |
| created_at | TEXT | ISO8601 |
| updated_at | TEXT | ISO8601 |

Table: work_items

| Column | Type | Notes |
|---|---|---|
| id | TEXT PRIMARY KEY | UUID |
| project_id | TEXT FK | References projects.id (optional) |
| type | TEXT NOT NULL | e.g. code_review, bug_fix, infra_setup, gitea_issue |
| description | TEXT NOT NULL | Human-readable summary |
| payload | TEXT | JSON blob with type-specific fields |
| priority | INTEGER | 1-5, 1=highest. Default 3 |
| status | TEXT NOT NULL | See Status Lifecycle below |
| assigned_agent | TEXT | e.g. steve-w. NULL until dispatched |
| created_by | TEXT | e.g. marcus-a, gitea-watcher, bms-ticket-workflow |
| created_at | TEXT | ISO8601 |
| updated_at | TEXT | ISO8601 |
| completed_at | TEXT | ISO8601, set when status → completed/failed/cancelled |
| outcome | TEXT | success, failed, cancelled, NULL |
| notes | TEXT | Agent-added notes, URLs, context |
notes TEXT Agent-added notes, URLs, context

Status Lifecycle

queued → dispatched → in_progress → completed
                              ↘ blocked
                              ↘ failed
           ↘ cancelled (from queued or dispatched only)
  • queued — New work, waiting for Marcus to dispatch
  • dispatched — Marcus has assigned to an agent (agent has not yet picked it up)
  • in_progress — Agent acknowledged and is working it
  • blocked — Agent hit a holding condition (waiting on external input, dependencies, etc.)
  • failed — Agent attempted but hit an unrecoverable error
  • completed — Agent finished successfully; Marcus reviews before marking truly done
  • cancelled — Marcus killed it before work started
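
The lifecycle above can be expressed as a small transition table. The following is a minimal sketch, not the reference implementation: the names `allowed` and `canTransition` are hypothetical, and because the diagram shows no transition out of blocked, the sketch assumes there is none.

```go
package main

import "fmt"

// Status values taken from the lifecycle above.
const (
	Queued     = "queued"
	Dispatched = "dispatched"
	InProgress = "in_progress"
	Blocked    = "blocked"
	Failed     = "failed"
	Completed  = "completed"
	Cancelled  = "cancelled"
)

// allowed maps each status to the statuses it may move to.
// Assumption: the spec defines no transition out of blocked,
// so blocked (like the terminal states) has no entry here.
var allowed = map[string][]string{
	Queued:     {Dispatched, Cancelled},
	Dispatched: {InProgress, Cancelled},
	InProgress: {Completed, Blocked, Failed},
}

// canTransition reports whether moving from one status to another
// appears in the lifecycle diagram.
func canTransition(from, to string) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Queued, Dispatched))    // allowed by the diagram
	fmt.Println(canTransition(InProgress, Cancelled)) // cancel is only valid before work starts
}
```

A PATCH handler could call a check like this before applying a status change and reject anything the diagram does not show.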

Table: dispatch_log

| Column | Type | Notes |
|---|---|---|
| id | TEXT PRIMARY KEY | UUID |
| work_item_id | TEXT FK | References work_items.id |
| dispatched_at | TEXT | ISO8601 |
| agent | TEXT | Which agent it was dispatched to |
| completed_at | TEXT | ISO8601, when status reached terminal state |
| outcome | TEXT | success, failed, cancelled |

Constraints

  • One in_progress work item per agent at any time (enforced via DB constraint or application logic)
  • completed_at and outcome can only be set when status is terminal (completed, failed, cancelled)
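
If the one-in_progress-per-agent rule is enforced in application logic, the check might look like the sketch below. It works on an in-memory slice to stay self-contained; `checkCanStart` and `ErrAgentBusy` are hypothetical names, and a real handler would presumably run the equivalent query and the status update in one transaction.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkItem holds only the fields this check needs.
type WorkItem struct {
	ID            string
	Status        string
	AssignedAgent string
}

// ErrAgentBusy is returned when the agent already has active work.
var ErrAgentBusy = errors.New("agent already has an in_progress work item")

// checkCanStart returns ErrAgentBusy if the agent owns an in_progress
// item other than the one identified by id.
func checkCanStart(items []WorkItem, id, agent string) error {
	for _, it := range items {
		if it.ID != id && it.AssignedAgent == agent && it.Status == "in_progress" {
			return ErrAgentBusy
		}
	}
	return nil
}

func main() {
	items := []WorkItem{
		{ID: "a", Status: "in_progress", AssignedAgent: "steve-w"},
		{ID: "b", Status: "dispatched", AssignedAgent: "steve-w"},
	}
	// Item b cannot start while item a is still in progress.
	fmt.Println(checkCanStart(items, "b", "steve-w"))
}
```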

API Endpoints

Projects

POST /projects

// Request
{ "name": "Shopping List API", "external_ref": "todoist:123" }

// Response 201
{ "id": "uuid", "name": "Shopping List API", "external_ref": "todoist:123", "created_at": "...", "updated_at": "..." }

GET /projects - list all projects

GET /projects/:id - get project

PATCH /projects/:id - update name or external_ref


Work Items

POST /work

// Request
{
  "project_id": "uuid",           // optional
  "type": "code_review",
  "description": "Review PR #3 in shopping-list-api",
  "payload": { "pr": 3, "repo": "shopping-list-api" },
  "priority": 2,
  "assigned_agent": "steve-w"
}

// Response 201
{ "id": "uuid", "status": "queued", "created_at": "...", ... }

Note: assigned_agent is accepted on POST but item is still created as queued. Marcus must call PATCH /work/:id with status=dispatched to actually dispatch. (This allows Marcus to set up all fields before pulling the trigger.)
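
A sketch of how the create path could honor this rule: `assigned_agent` is stored, but the status is always set to queued. The type and function names (`CreateWorkRequest`, `newWorkItem`) are hypothetical, and the caller is assumed to supply the ID and timestamp.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// CreateWorkRequest mirrors the POST /work request body shown above.
type CreateWorkRequest struct {
	ProjectID     string          `json:"project_id,omitempty"`
	Type          string          `json:"type"`
	Description   string          `json:"description"`
	Payload       json.RawMessage `json:"payload,omitempty"`
	Priority      int             `json:"priority,omitempty"`
	AssignedAgent string          `json:"assigned_agent,omitempty"`
}

// WorkItem holds the subset of work_items columns used in this sketch.
type WorkItem struct {
	ID, Type, Description string
	Priority              int
	Status, AssignedAgent string
	CreatedAt, UpdatedAt  string
}

// newWorkItem validates the request and builds a queued item.
// The status is never taken from the request.
func newWorkItem(req CreateWorkRequest, id, now string) (WorkItem, error) {
	if req.Type == "" || req.Description == "" {
		return WorkItem{}, errors.New("type and description are required")
	}
	priority := req.Priority
	if priority == 0 {
		priority = 3 // default from the schema
	}
	if priority < 1 || priority > 5 {
		return WorkItem{}, fmt.Errorf("priority must be 1-5, got %d", priority)
	}
	return WorkItem{
		ID: id, Type: req.Type, Description: req.Description,
		Priority: priority, Status: "queued", AssignedAgent: req.AssignedAgent,
		CreatedAt: now, UpdatedAt: now,
	}, nil
}

func main() {
	body := []byte(`{"type":"code_review","description":"Review PR #3 in shopping-list-api","priority":2,"assigned_agent":"steve-w"}`)
	var req CreateWorkRequest
	if err := json.Unmarshal(body, &req); err != nil {
		panic(err)
	}
	item, err := newWorkItem(req, "uuid", "2026-04-11T13:29:04-05:00")
	if err != nil {
		panic(err)
	}
	fmt.Println(item.Status, item.AssignedAgent)
}
```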

GET /work — list work items. Supports filters:

  • ?status=queued — pending work
  • ?status=in_progress — active work
  • ?status=blocked — needs intervention
  • ?agent=steve-w&status=queued — Steve's pending queue
  • ?project_id=uuid — items in a project
  • ?since=ISO8601 — created after timestamp

Sort order: priority ASC, then created_at ASC. When since is used, results are sorted by created_at ASC only.
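
The ordering rule could be sketched as follows. In practice this would more likely be an ORDER BY clause; the Go version just makes the rule explicit. `sortWork` is a hypothetical name, and the string comparison assumes all timestamps are stored in the same fixed-width UTC ISO8601 form.

```go
package main

import (
	"fmt"
	"sort"
)

// WorkItem holds only the fields that affect ordering.
type WorkItem struct {
	ID        string
	Priority  int
	CreatedAt string // ISO8601; assumed UTC and fixed-width so strings sort chronologically
}

// sortWork orders items by priority then created_at, or by created_at
// alone when the since filter was used.
func sortWork(items []WorkItem, sinceUsed bool) {
	sort.SliceStable(items, func(i, j int) bool {
		a, b := items[i], items[j]
		if !sinceUsed && a.Priority != b.Priority {
			return a.Priority < b.Priority
		}
		return a.CreatedAt < b.CreatedAt
	})
}

func main() {
	items := []WorkItem{
		{ID: "a", Priority: 3, CreatedAt: "2026-04-11T10:00:00Z"},
		{ID: "b", Priority: 1, CreatedAt: "2026-04-11T12:00:00Z"},
		{ID: "c", Priority: 1, CreatedAt: "2026-04-11T11:00:00Z"},
	}
	sortWork(items, false)
	for _, it := range items {
		fmt.Println(it.ID, it.Priority, it.CreatedAt)
	}
}
```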

GET /work/:id — get single work item with dispatch history

PATCH /work/:id

// Request — one or more fields
{
  "status": "in_progress",       // queued→dispatched→in_progress, or in_progress→blocked
  "outcome": "success",           // set when moving to completed/failed/cancelled
  "notes": "Reviewed and approved, merged to main",
  "assigned_agent": "steve-w"    // required to move from queued→dispatched
}

// Response 200 — updated work item

Special transitions:

  • queued → dispatched requires assigned_agent to be set
  • dispatched → in_progress is normally made by the assigned agent when it picks up the item; Marcus (or a heartbeat safety net) may also make it to confirm the agent picked it up
  • in_progress → completed or in_progress → failed requires outcome
  • A transition to blocked should include an explanation in notes
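
The field requirements above might be checked along these lines. This is a sketch under assumptions: `Patch` and `validatePatch` are hypothetical names, an agent already stored on the item is taken to satisfy the assigned_agent requirement, and the "should" for blocked notes is treated as a hard requirement.

```go
package main

import (
	"errors"
	"fmt"
)

// Patch mirrors the PATCH /work/:id request body shown above.
type Patch struct {
	Status        string
	Outcome       string
	Notes         string
	AssignedAgent string
}

// validatePatch checks the field requirements for a status change.
// currentAgent is the assigned_agent already stored on the item, if any.
func validatePatch(currentStatus, currentAgent string, p Patch) error {
	switch p.Status {
	case "dispatched":
		if currentStatus == "queued" && p.AssignedAgent == "" && currentAgent == "" {
			return errors.New("queued -> dispatched requires assigned_agent")
		}
	case "completed", "failed":
		if p.Outcome == "" {
			return errors.New("outcome is required when moving to " + p.Status)
		}
	case "blocked":
		// Assumption: the spec says "should"; this sketch enforces it.
		if p.Notes == "" {
			return errors.New("blocked requires an explanation in notes")
		}
	}
	return nil
}

func main() {
	fmt.Println(validatePatch("queued", "", Patch{Status: "dispatched"}))
	fmt.Println(validatePatch("in_progress", "steve-w", Patch{Status: "completed", Outcome: "success"}))
}
```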

DELETE /work/:id — alias for PATCH /work/:id with status=cancelled. Returns 204.


Monitoring (for Marcus heartbeat)

  • GET /work?status=in_progress - what's being worked on right now
  • GET /work?status=blocked - items that need intervention
  • GET /work?status=failed - items that need review
  • GET /work?status=completed&since=<ts> - completed since last check (for notifications)


Dispatch Flow

Marcus dispatches to Steve:

  1. POST /work with Steve's queue item → status=queued
  2. PATCH /work/:id with status=dispatched, assigned_agent=steve-w
  3. (optional) Immediately PATCH /work/:id with status=in_progress to mark that Steve has picked it up

Steve picks up his queue:

GET /work?agent=steve-w&status=dispatched
→ for each item: PATCH /work/:id with status=in_progress
→ work the task
→ PATCH /work/:id with status=completed, outcome=success, notes=...

Marcus heartbeat checks:

GET /work?status=blocked        → alert Daniel if anything new
GET /work?status=failed         → alert Daniel if anything new
GET /work?status=completed&since=<ts> → notify Daniel of completions
GET /work?status=in_progress    → detect stale items (>30min → flag blocked)

Stale Task Detection

Marcus's heartbeat checks in_progress items on every run. If any item has updated_at older than 30 minutes and no notes update, Marcus marks it blocked with a note about the timeout and alerts Daniel.
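
The timeout rule can be sketched as a pure function over the in_progress list. `staleIDs` is a hypothetical name. The sketch assumes timestamps are RFC 3339 strings and that any notes update also bumps updated_at, so checking updated_at alone covers the "no notes update" condition.

```go
package main

import (
	"fmt"
	"time"
)

// WorkItem holds only the fields the stale check needs.
type WorkItem struct {
	ID        string
	Status    string
	UpdatedAt string // ISO8601 / RFC 3339
}

// staleAfter is the 30-minute threshold from the spec.
const staleAfter = 30 * time.Minute

// staleIDs returns the IDs of in_progress items whose updated_at is
// more than staleAfter before now.
func staleIDs(items []WorkItem, now string) ([]string, error) {
	nowT, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return nil, fmt.Errorf("parse now: %w", err)
	}
	var stale []string
	for _, it := range items {
		if it.Status != "in_progress" {
			continue
		}
		updated, err := time.Parse(time.RFC3339, it.UpdatedAt)
		if err != nil {
			return nil, fmt.Errorf("parse updated_at for %s: %w", it.ID, err)
		}
		if nowT.Sub(updated) > staleAfter {
			stale = append(stale, it.ID)
		}
	}
	return stale, nil
}

func main() {
	items := []WorkItem{
		{ID: "a", Status: "in_progress", UpdatedAt: "2026-04-11T12:00:00Z"},
		{ID: "b", Status: "in_progress", UpdatedAt: "2026-04-11T12:45:00Z"},
	}
	ids, err := staleIDs(items, "2026-04-11T13:00:00Z")
	fmt.Println(ids, err)
}
```

Each returned ID would then get a PATCH to status=blocked with a timeout note, followed by the alert to Daniel.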


Skills / Integration

Marcus (main dispatcher)

A work-queue skill for Marcus: thin wrapper around HTTP calls to the API. Methods:

  • work_add(type, description, payload, agent, project_id?) → POST /work
  • work_dispatch(work_item_id, agent) → PATCH status=dispatched, optionally followed by a second PATCH status=in_progress
  • work_update(work_item_id, status, outcome?, notes?) → PATCH
  • work_list(status?, agent?, project_id?) → GET /work
  • work_stale_check() → poll in_progress, timeout stale items
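
As an illustration of how thin the wrapper can be, here is a sketch of the URL construction behind work_list. `workListURL` is a hypothetical helper, and the base URL is an assumption taken from the port in the Docker Compose example.

```go
package main

import (
	"fmt"
	"net/url"
)

// workListURL builds the GET /work URL for the optional filters.
// Empty strings are skipped.
func workListURL(baseURL, status, agent, projectID string) string {
	q := url.Values{}
	if status != "" {
		q.Set("status", status)
	}
	if agent != "" {
		q.Set("agent", agent)
	}
	if projectID != "" {
		q.Set("project_id", projectID)
	}
	u := baseURL + "/work"
	// Encode sorts parameters by key, so the output is deterministic.
	if enc := q.Encode(); enc != "" {
		u += "?" + enc
	}
	return u
}

func main() {
	// Assumed base URL; adjust to wherever the API is deployed.
	fmt.Println(workListURL("http://localhost:8080", "dispatched", "steve-w", ""))
}
```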

Steve's Agent

Steve's agent uses a polling loop:

every N minutes:
  GET /work?agent=steve-w&status=dispatched
  for each item:
    PATCH /work/:id with status=in_progress
    do work
    PATCH /work/:id with status=completed, outcome=success, notes=result
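
One pass of the loop above could look like this in Go. The `QueueClient` interface and `processQueue` are hypothetical; the interface stands in for the HTTP calls so the sketch runs without a server. Reporting status=failed when the work function returns an error is an assumption, since the pseudocode only shows the success path.

```go
package main

import "fmt"

// WorkItem holds the fields the agent needs to do its work.
type WorkItem struct {
	ID          string
	Description string
}

// QueueClient is a hypothetical wrapper over GET /work and PATCH /work/:id.
type QueueClient interface {
	ListDispatched(agent string) ([]WorkItem, error)
	UpdateStatus(id, status, outcome, notes string) error
}

// processQueue runs one polling pass: claim each dispatched item,
// do the work, then report the result.
func processQueue(c QueueClient, agent string, work func(WorkItem) (string, error)) error {
	items, err := c.ListDispatched(agent)
	if err != nil {
		return err
	}
	for _, it := range items {
		if err := c.UpdateStatus(it.ID, "in_progress", "", ""); err != nil {
			return err
		}
		notes, werr := work(it)
		status, outcome := "completed", "success"
		if werr != nil {
			// Assumption: an error from the work function maps to failed.
			status, outcome, notes = "failed", "failed", werr.Error()
		}
		if err := c.UpdateStatus(it.ID, status, outcome, notes); err != nil {
			return err
		}
	}
	return nil
}

// fakeClient records calls in memory so the sketch can run standalone.
type fakeClient struct {
	items []WorkItem
	calls []string
}

func (f *fakeClient) ListDispatched(agent string) ([]WorkItem, error) { return f.items, nil }

func (f *fakeClient) UpdateStatus(id, status, outcome, notes string) error {
	f.calls = append(f.calls, id+":"+status)
	return nil
}

func main() {
	f := &fakeClient{items: []WorkItem{{ID: "w1", Description: "Review PR #3"}}}
	err := processQueue(f, "steve-w", func(it WorkItem) (string, error) {
		return "reviewed " + it.Description, nil
	})
	fmt.Println(f.calls, err)
}
```

Because items are handled one at a time, this shape also respects the one in_progress item per agent constraint.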

Project Structure

work-queue-api/
├── SPEC.md
├── Dockerfile
├── docker-compose.yml          ← includes PostgreSQL (production) / SQLite dev option
├── ci.yml                      ← GitHub Actions: build + push container
├── internal/
│   ├── api/
│   │   ├── server.go
│   │   ├── handlers_work.go
│   │   ├── handlers_projects.go
│   │   └── middleware.go
│   ├── db/
│   │   ├── migrations/
│   │   │   └── 001_initial.sql
│   │   └── sqlite.go
│   └── model/
│       └── models.go
└── README.md

CI/CD

GitHub Actions workflow (ci.yml):

  1. Build Docker container on push to main
  2. Push to git.danhenry.dev/thelab/work-queue-api:latest
  3. Tag with git short SHA

Docker Compose Example

version: '3.8'
services:
  work-queue-api:
    image: git.danhenry.dev/thelab/work-queue-api:latest
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=/data/work_queue.db
      - PORT=8080
    volumes:
      - ./data:/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
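
The healthcheck above probes /health, which is not defined in the endpoint list. A minimal handler that would satisfy it might look like the sketch below. The JSON body is an assumption; wget only needs a 2xx response to exit successfully. `probe` is a hypothetical helper that exercises the handler in-process.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthHandler answers the container healthcheck with 200 and a
// small JSON body. The body shape is an assumption, not part of the spec.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, `{"status":"ok"}`)
}

// probe calls the handler in-process and returns the status code and body.
func probe() (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe()
	fmt.Println(code, body)
}
```

In the server, the handler would be registered on the router at /health alongside the /projects and /work routes.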

Notes for Implementation

  • Use Go with a lightweight router (chi or gin)
  • Use net/http with SQLite via mattn/go-sqlite3
  • No auth required (internal network only)
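  • assigned_agent uniqueness on in_progress can be enforced in application logic, or in the database with a partial unique index, which SQLite supports since 3.8.0 (e.g. CREATE UNIQUE INDEX one_active_per_agent ON work_items(assigned_agent) WHERE status = 'in_progress')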
  • dispatch_log table is append-only; used for audit trail and staleness detection