ece1b037d16c985089cd033152fecf1b6932f98f
notebook-tools
FastAPI service that:
- downloads PDFs from Paperless-ngx
- splits them into pages (JPEG)
- OCRs each page via your llama.cpp OpenAI-compatible endpoint
- converts each page back into a single-page PDF
- uploads one Paperless document per page
- patches each uploaded document with:
content= OCR text- custom fields
notebook_id(field id 1) andnotebook_page(field id 2) document_type= Paperless document type id (default 3, configurable)
Setup
Install deps:
uv sync
Create a .env file (example below) and do not commit it.
Run locally
uv run uvicorn notebook_tools.api:app --reload --host 0.0.0.0 --port 8080
Then open the docs at:
http://127.0.0.1:8080/docs(same machine)http://<your-lan-ip>:8080/docs(other machines on your network)
If other machines still can’t connect, check your macOS firewall and any router/network rules.
Example .env
PAPERLESS_BASE_URL="https://paperless.example.com"
PAPERLESS_TOKEN="paste-token-here"
LLAMA_BASE_URL="http://127.0.0.1:9292"
LLAMA_MODEL="ggml-model-q4_k_m"
# Custom field ids in Paperless
PAPERLESS_CUSTOM_FIELD_NOTEBOOK_ID=1
PAPERLESS_CUSTOM_FIELD_NOTEBOOK_PAGE=2
PAPERLESS_DOCUMENT_TYPE_ID=3
# Rendering / OCR knobs
RENDER_DPI=200
OCR_MAX_TOKENS=1024
OCR_TEMPERATURE=0.0
Description
Languages
Python
98.9%
Dockerfile
1.1%