Webhook deliveries

The wire contract your server receives when an extraction job finishes.

Two ways to receive webhooks

  • Per-request callback. Pass options.callback_url on POST /v1/extract. One-shot delivery for that job only.
  • Registered endpoint. Add an endpoint in Dashboard → Webhooks. Every job your customer runs fans out to every active endpoint. Each endpoint has its own signing secret you can rotate.

Request

We send a POST to your URL with a JSON body and these headers:

http
POST /your/webhook HTTP/1.1
Content-Type: application/json
User-Agent: OCRQueen-Webhooks/1.0
OCRQueen-Event: extraction.completed
OCRQueen-Signature: sha256=4f8a1c…
OCRQueen-Delivery-Id: d_5f8a…
OCRQueen-Attempt: 1

| Header | Notes |
| --- | --- |
| OCRQueen-Event | extraction.completed, extraction.failed, or webhook.test. |
| OCRQueen-Signature | HMAC-SHA256 of the raw body, hex-encoded, prefixed sha256=. |
| OCRQueen-Delivery-Id | Unique per delivery. The same ID is reused across all retry attempts of that delivery. |
| OCRQueen-Attempt | 1-indexed. 1 = first try. |
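
A receiver will usually want these four headers in a typed form before doing anything else. The following is a minimal sketch, not an official client: `DeliveryHeaders` and `parse_delivery_headers` are hypothetical names, and rejecting unknown event types is a design assumption (you may prefer to accept and ignore them).

```python
from dataclasses import dataclass
from typing import Mapping

# Event names documented for the OCRQueen-Event header.
KNOWN_EVENTS = {"extraction.completed", "extraction.failed", "webhook.test"}


@dataclass(frozen=True)
class DeliveryHeaders:  # hypothetical container, not part of any SDK
    event: str
    signature: str
    delivery_id: str
    attempt: int


def parse_delivery_headers(headers: Mapping[str, str]) -> DeliveryHeaders:
    """Extract and validate the OCRQueen-* headers from an incoming request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        event = lowered["ocrqueen-event"]
        signature = lowered["ocrqueen-signature"]
        delivery_id = lowered["ocrqueen-delivery-id"]
        attempt = int(lowered["ocrqueen-attempt"])  # ValueError if not an integer
    except KeyError as missing:
        raise ValueError(f"missing header: {missing.args[0]}") from None
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unknown event: {event}")
    if not signature.startswith("sha256="):
        raise ValueError("signature must be prefixed with sha256=")
    if attempt < 1:
        raise ValueError("attempt is 1-indexed")
    return DeliveryHeaders(event, signature, delivery_id, attempt)
```

Parsing only checks the shape of the headers; the signature still has to be verified against the raw body before the payload is trusted.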

Payload

The JSON body is canonicalized before signing: sort_keys=true and no whitespace between tokens. The bytes we sign are exactly the bytes you receive.
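
To illustrate why the exact bytes matter, here is a small sketch of that canonical form using Python's standard `json` module. It assumes the serializer behaves like `json.dumps(..., sort_keys=True, separators=(",", ":"))`; details such as non-ASCII escaping are not stated here, so treat this as an illustration and always verify against the bytes you actually received. The payload and `demo-secret` are made up.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys and no whitespace between tokens."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(body: bytes, secret: str) -> str:
    """HMAC-SHA256 of the body, hex-encoded, prefixed with sha256=."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


# Illustrative payload only; real deliveries carry more fields.
payload = {"job_id": "job_123", "event_type": "extraction.completed"}
compact = canonical_bytes(payload)
pretty = json.dumps(payload, indent=2).encode("utf-8")

print(compact)  # b'{"event_type":"extraction.completed","job_id":"job_123"}'
# Same JSON value, different bytes, so the signatures do not match.
print(sign(compact, "demo-secret") == sign(pretty, "demo-secret"))  # False
```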

json
{
  "event_id": "evt_5f8a...",
  "event_type": "extraction.completed",
  "timestamp": "2026-05-14T12:00:08Z",
  "job_id": "5f8a...",
  "data": {
    "document": { /* full DigitisedDocument — same shape as GET /v1/jobs/{id}.document */ }
  }
}

document has the exact shape documented under GET /v1/jobs/{id} — including document.markdown.

For extraction.failed, data instead carries the error:

json
{
  "event_id": "evt_...",
  "event_type": "extraction.failed",
  "timestamp": "...",
  "job_id": "...",
  "data": {
    "error": {
      "code": "PDF_PASSWORD_PROTECTED",
      "message": "Source PDF requires a password we don't have."
    }
  }
}

For the synthetic webhook.test event fired from the dashboard, job_id is absent and data contains message and endpoint_id. Use it to verify signature checking and reachability without spending credits.
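
The three payload shapes above can be handled with a simple branch on event_type. This is a sketch under assumptions: `summarize_event` is a hypothetical helper, and it only touches fields named in this section (job_id, data.document, data.error.code, data.endpoint_id).

```python
import json


def summarize_event(raw_body: bytes) -> str:
    """Return a one-line summary for each documented event type (hypothetical helper)."""
    event = json.loads(raw_body)
    event_type = event["event_type"]
    data = event.get("data", {})
    if event_type == "extraction.completed":
        # data.document has the same shape as GET /v1/jobs/{id}.document.
        has_markdown = "markdown" in data["document"]
        return f"job {event['job_id']} completed (markdown present: {has_markdown})"
    if event_type == "extraction.failed":
        return f"job {event['job_id']} failed: {data['error']['code']}"
    if event_type == "webhook.test":
        # job_id is absent on test events, so it is not read here.
        return f"test event for endpoint {data['endpoint_id']}"
    return f"ignored unknown event type {event_type!r}"
```

Call it only after the signature check has passed; the function itself trusts its input.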

Verifying the signature

Verify before you trust. Compute the HMAC over the raw bytes of the request body (don't re-serialize the JSON; whitespace differences break the HMAC). Use a constant-time comparison.

Node

javascript
import crypto from "crypto";

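function verify(rawBody, signatureHeader, secret) {
  const expected = Buffer.from(
    "sha256=" +
      crypto.createHmac("sha256", secret).update(rawBody).digest("hex"),
  );
  const received = Buffer.from(signatureHeader ?? "");
  // timingSafeEqual throws if the lengths differ, so reject those first
  if (expected.length !== received.length) return false;
  // timingSafeEqual is constant-time; protects against timing attacks
  return crypto.timingSafeEqual(expected, received);
}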

Python

python
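import hashlib
import hmac

def verify(raw_body: bytes, sig_header: str, secret: str) -> bool:
    # HMAC-SHA256 over the raw request bytes, hex-encoded, prefixed "sha256="
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    expected = f"sha256={digest}"
    # compare_digest is constant-time; comparing bytes avoids a TypeError
    # when the header contains non-ASCII characters
    return hmac.compare_digest(expected.encode(), sig_header.encode())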

Where to get the secret: for registered endpoints, copy it when you create the endpoint, or rotate it from the dashboard (the new secret is shown once). For callback_url deliveries, the per-customer signing secret lives on your account; see Settings.
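
Because endpoint secrets can be rotated, a receiver may want to accept a signature made with either the current or the previous secret while a rotation is rolling out. The sketch below shows one way to do that; `verify_any` is a hypothetical helper, and whether deliveries signed with the old secret can still arrive after a rotation is an assumption this page does not confirm.

```python
import hashlib
import hmac
from typing import Iterable


def expected_signature(raw_body: bytes, secret: str) -> str:
    """HMAC-SHA256 of the raw body, hex-encoded, prefixed with sha256=."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_any(raw_body: bytes, sig_header: str, secrets: Iterable[str]) -> bool:
    """Return True if the header matches the HMAC under any candidate secret."""
    received = sig_header.encode()
    matched = False
    for secret in secrets:
        candidate = expected_signature(raw_body, secret).encode()
        # No early exit: every candidate is checked, so timing does not
        # reveal which secret (if any) matched.
        matched |= hmac.compare_digest(candidate, received)
    return matched
```

Once the old secret is no longer needed, drop it from the list so that only the current secret is accepted.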

Retry schedule

Any non-2xx response, timeout, or network error counts as a failed attempt. We make up to 5 attempts in total, with these delays between them:

| Attempt | Delay after previous |
| --- | --- |
| 1 | none (first attempt) |
| 2 | 30 seconds |
| 3 | 2 minutes |
| 4 | 10 minutes |
| 5 | 30 minutes |
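
For capacity planning it can help to know how long an endpoint may be down before a delivery is lost. The sketch below adds up the delays from the table. It assumes each delay is measured from the previous attempt and ignores the time each attempt itself takes, so the offsets are lower bounds; `attempt_offsets` is a hypothetical helper.

```python
from datetime import timedelta

# Delays before attempts 2..5, taken from the retry table.
RETRY_DELAYS = [
    timedelta(seconds=30),
    timedelta(minutes=2),
    timedelta(minutes=10),
    timedelta(minutes=30),
]


def attempt_offsets(delays=RETRY_DELAYS):
    """Earliest offset of each attempt relative to the first one."""
    offsets = [timedelta(0)]
    for delay in delays:
        offsets.append(offsets[-1] + delay)
    return offsets


for number, offset in enumerate(attempt_offsets(), start=1):
    print(f"attempt {number}: +{offset}")
# attempt 1: +0:00:00
# attempt 2: +0:00:30
# attempt 3: +0:02:30
# attempt 4: +0:12:30
# attempt 5: +0:42:30
```

Under these assumptions the last automatic attempt happens about 42.5 minutes after the first.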

After attempt 5 fails the delivery is marked failed and we stop. You can manually retry any delivery from Dashboard → Webhooks (which resets the counter). Request timeout is 10 seconds — return 2xx quickly and do heavy work asynchronously on your side.
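
One way to stay well inside the timeout is to verify the signature, put the event on a queue, and return immediately, leaving the slow work to a background worker. The sketch below is framework-agnostic and uses an in-memory queue purely for illustration; `receive`, `worker`, and the choice of 202 and 401 status codes are assumptions, not part of the contract.

```python
import hashlib
import hmac
import json
import logging
import queue
import threading

# In-memory queue for illustration; events are lost if the process dies.
work_queue: "queue.Queue[dict]" = queue.Queue()


def receive(raw_body: bytes, sig_header: str, secret: str) -> int:
    """Verify, enqueue, and return an HTTP status without doing heavy work."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(f"sha256={digest}".encode(), sig_header.encode()):
        return 401  # non-2xx, so this counts as a failed attempt
    work_queue.put(json.loads(raw_body))
    return 202  # any 2xx acknowledges the delivery


def worker(handle) -> None:
    """Drain the queue forever; `handle` is your slow processing function."""
    while True:
        event = work_queue.get()
        try:
            handle(event)
        except Exception:
            # Keep the worker alive; the delivery was already acknowledged.
            logging.exception("webhook processing failed")
        finally:
            work_queue.task_done()
```

A typical setup starts the worker once at boot, for example `threading.Thread(target=worker, args=(process_event,), daemon=True).start()`, where `process_event` is your own function. Because a 2xx tells the sender to stop retrying, a durable queue is the safer choice in production.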

Idempotency on your side

Use OCRQueen-Delivery-Id (or the body's event_id) as your idempotency key. Both are stable across retry attempts — same ID in your DB = same delivery.
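
A minimal sketch of that idempotency check, using SQLite from the Python standard library. The table name `seen_deliveries` and the helpers `open_store` and `claim_delivery` are hypothetical; any store with a unique constraint works the same way.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store that remembers processed delivery IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_deliveries (delivery_id TEXT PRIMARY KEY)"
    )
    return conn


def claim_delivery(conn: sqlite3.Connection, delivery_id: str) -> bool:
    """Return True the first time an ID is seen, False for a retry or duplicate."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO seen_deliveries (delivery_id) VALUES (?)",
            (delivery_id,),
        )
    # rowcount is 1 when the row was inserted and 0 when it already existed.
    return cursor.rowcount == 1
```

Process the event only when `claim_delivery` returns True, and still answer 2xx for duplicates so the sender stops retrying. The primary key makes the check atomic, which a separate SELECT followed by INSERT would not be.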