Webhook deliveries
The wire contract your server receives when an extraction job finishes.
Two ways to receive webhooks
- Per-request callback. Pass options.callback_url on POST /v1/extract. One-shot delivery for that job only.
- Registered endpoint. Add an endpoint in Dashboard → Webhooks. Every job your customer runs fans out to every active endpoint. Each endpoint has its own signing secret you can rotate.
Request
We send a POST to your URL with a JSON body and these headers:
POST /your/webhook HTTP/1.1
Content-Type: application/json
User-Agent: OCRQueen-Webhooks/1.0
OCRQueen-Event: extraction.completed
OCRQueen-Signature: sha256=4f8a1c…
OCRQueen-Delivery-Id: d_5f8a…
OCRQueen-Attempt: 1
| Header | Notes |
|---|---|
| OCRQueen-Event | extraction.completed, extraction.failed, or webhook.test. |
| OCRQueen-Signature | HMAC-SHA256 of the raw body, hex-encoded, prefixed sha256=. |
| OCRQueen-Delivery-Id | Unique per delivery attempt set. Stable across retries. |
| OCRQueen-Attempt | 1-indexed. 1 = first try. |
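A minimal sketch of reading these headers on the receiving side. The names `DeliveryHeaders` and `parse_delivery_headers` are illustrative, not part of any OCRQueen SDK, and the sketch assumes your framework hands you the headers as a plain string mapping.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DeliveryHeaders:
    event: str
    signature: str
    delivery_id: str
    attempt: int


def parse_delivery_headers(headers: Mapping[str, str]) -> DeliveryHeaders:
    """Extract the OCRQueen-* headers; raises ValueError if any is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before looking anything up.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    try:
        event = lowered["ocrqueen-event"]
        signature = lowered["ocrqueen-signature"]
        delivery_id = lowered["ocrqueen-delivery-id"]
        attempt = int(lowered["ocrqueen-attempt"])  # non-numeric values raise ValueError
    except KeyError as exc:
        raise ValueError(f"missing header: {exc.args[0]}") from exc
    if not signature.startswith("sha256="):
        raise ValueError("signature must start with sha256=")
    if attempt < 1:
        raise ValueError("attempt is 1-indexed")
    return DeliveryHeaders(event, signature, delivery_id, attempt)
```

The parser deliberately does not reject unfamiliar event names: a non-2xx response would trigger retries, so it is usually safer to accept the delivery and ignore events you do not handle.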
Payload
The JSON body is canonicalized (sort_keys=true, no whitespace), so the bytes we sign are the bytes you receive. The examples here are pretty-printed for readability.
{
"event_id": "evt_5f8a...",
"event_type": "extraction.completed",
"timestamp": "2026-05-14T12:00:08Z",
"job_id": "5f8a...",
"data": {
"document": { /* full DigitisedDocument — same shape as GET /v1/jobs/{id}.document */ }
}
}
document has the exact shape documented under GET /v1/jobs/{id}, including document.markdown.
For extraction.failed, data instead carries the error:
{
"event_id": "evt_...",
"event_type": "extraction.failed",
"timestamp": "...",
"job_id": "...",
"data": {
"error": {
"code": "PDF_PASSWORD_PROTECTED",
"message": "Source PDF requires a password we don't have."
}
}
}
For the synthetic webhook.test event fired from the dashboard, job_id is absent and data = { message, endpoint_id }. Use this to verify signature checking and reachability without burning credits.
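The three payload shapes can be told apart by event_type. The sketch below shows one way to branch on it. The function name `summarize_event` is hypothetical, and treating document.markdown as a string is an assumption based on the field name.

```python
from typing import Any, Dict


def summarize_event(payload: Dict[str, Any]) -> str:
    """Return a one-line summary for each documented event type."""
    event_type = payload["event_type"]
    data = payload.get("data", {})
    if event_type == "extraction.completed":
        # ASSUMPTION: document.markdown is a string.
        markdown = data["document"].get("markdown", "")
        return f"job {payload['job_id']} completed ({len(markdown)} chars of markdown)"
    if event_type == "extraction.failed":
        error = data["error"]
        return f"job {payload['job_id']} failed: {error['code']}: {error['message']}"
    if event_type == "webhook.test":
        # Test events carry no job_id, so do not index payload["job_id"] here.
        return f"test event for endpoint {data['endpoint_id']}: {data['message']}"
    # Unknown types are ignored rather than rejected, so new events do not cause retries.
    return f"ignored unknown event type {event_type!r}"
```

Only call something like this after the signature has been verified; the payload should not be trusted before that.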
Verifying the signature
Verify before you trust. Compute the HMAC over the raw bytes of the request body. Don't re-serialize the JSON first, because whitespace differences change the digest. Compare with a constant-time comparator.
Node
import crypto from "crypto";

function verify(rawBody, signatureHeader, secret) {
  const expected =
    "sha256=" +
    crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader ?? "");
  // timingSafeEqual throws if the lengths differ, so reject those first
  if (a.length !== b.length) return false;
  // timingSafeEqual is constant-time; protects against timing attacks
  return crypto.timingSafeEqual(a, b);
}
Python
import hashlib
import hmac

def verify(raw_body: bytes, sig_header: str, secret: str) -> bool:
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    expected = f"sha256={digest}"
    # compare_digest is constant-time; comparing bytes avoids a TypeError on non-ASCII input
    return hmac.compare_digest(expected.encode(), sig_header.encode())
Where to get the secret: for registered endpoints, copy it when you create the endpoint, or rotate it from the dashboard (the new secret is shown once). For callback_url, the per-customer signing secret lives on your account; see Settings.
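To show why the raw bytes matter, here is a framework-agnostic sketch that verifies first and parses second, then exercises itself with a locally signed fixture. Everything named here is an assumption for illustration: `sign`, `handle_delivery`, the 401/400 status choices, the secret, and the fixture values. The fixture also assumes the canonical form matches Python's `json.dumps(..., sort_keys=True, separators=(",", ":"))`, which the service's actual serializer may not match exactly.

```python
import hashlib
import hmac
import json
from typing import Mapping, Tuple


def sign(raw_body: bytes, secret: str) -> str:
    """Build the header value: 'sha256=' + hex HMAC-SHA256 of the raw body."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def handle_delivery(headers: Mapping[str, str], raw_body: bytes, secret: str) -> Tuple[int, str]:
    """Verify first, parse second. Returns (HTTP status, short reason)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    received = lowered.get("ocrqueen-signature", "")
    # Compare as bytes in constant time; a missing header simply fails the check.
    if not hmac.compare_digest(sign(raw_body, secret).encode(), received.encode()):
        return 401, "bad signature"
    try:
        payload = json.loads(raw_body)  # only parse after the signature checks out
    except ValueError:
        return 400, "invalid JSON"
    if not isinstance(payload, dict):
        return 400, "unexpected payload"
    return 200, f"accepted {payload.get('event_type')}"


# Local fixture. ASSUMPTION: compact, key-sorted JSON stands in for the canonical form.
SECRET = "example-secret"  # hypothetical secret, for illustration only
body = json.dumps(
    {"event_type": "webhook.test", "data": {"message": "ping", "endpoint_id": "ep_example"}},
    sort_keys=True,
    separators=(",", ":"),
).encode()
headers = {"OCRQueen-Signature": sign(body, SECRET)}

print(handle_delivery(headers, body, SECRET))  # (200, 'accepted webhook.test')

# Re-serializing with default separators adds spaces, so the same signature no longer matches.
reserialized = json.dumps(json.loads(body)).encode()
print(handle_delivery(headers, reserialized, SECRET))  # (401, 'bad signature')
```

In a real server, make sure your framework gives you the unparsed request body. Many frameworks parse JSON eagerly, and the parsed object cannot be used to reproduce the signed bytes.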
Retry schedule
Any non-2xx response, timeout, or network error counts as a failed attempt. We retry up to 5 total attempts with these delays:
| Attempt | Delay after previous |
|---|---|
| 1 | — |
| 2 | 30 seconds |
| 3 | 2 minutes |
| 4 | 10 minutes |
| 5 | 30 minutes |
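As a worked example, the delays in the table can be turned into a timeline. This sketch assumes each delay is counted from the start of the previous attempt and ignores request duration; the document does not say whether the clock starts at the request or at its failure, so treat the times as approximate. `RETRY_DELAYS` and `attempt_times` are illustrative names.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Delay before attempts 2 through 5, taken from the retry table.
RETRY_DELAYS = [
    timedelta(seconds=30),
    timedelta(minutes=2),
    timedelta(minutes=10),
    timedelta(minutes=30),
]


def attempt_times(first_attempt: datetime) -> List[datetime]:
    """Approximate start time of each of the 5 attempts."""
    times = [first_attempt]
    for delay in RETRY_DELAYS:
        times.append(times[-1] + delay)
    return times


start = datetime(2026, 5, 14, 12, 0, 8, tzinfo=timezone.utc)
for number, when in enumerate(attempt_times(start), start=1):
    print(number, when.isoformat())

# Total spread between the first and the last attempt.
print(sum(RETRY_DELAYS, timedelta()))  # 0:42:30
```

Under these assumptions, an endpoint that is down for more than roughly 42 minutes misses the delivery entirely.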
After attempt 5 fails, the delivery is marked failed and we stop. You can manually retry any delivery from Dashboard → Webhooks (which resets the counter). The request timeout is 10 seconds, so return 2xx quickly and do heavy work asynchronously on your side.
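One way to stay inside the timeout is to enqueue the verified body and respond immediately, leaving the slow work to a background worker. The sketch below uses an in-memory queue from the standard library. The names `jobs`, `acknowledge`, and `start_worker` are hypothetical, and a production setup would more likely use a durable queue so work survives a restart.

```python
import queue
import threading
import traceback
from typing import Callable

jobs: "queue.Queue[bytes]" = queue.Queue()


def acknowledge(raw_body: bytes) -> int:
    """Call from the request handler after the signature check: enqueue, then return at once."""
    jobs.put(raw_body)
    return 200  # respond well inside the 10-second timeout


def start_worker(process: Callable[[bytes], None]) -> threading.Thread:
    """Drain the queue on a background thread so slow work never blocks the response."""

    def run() -> None:
        while True:
            body = jobs.get()
            try:
                process(body)
            except Exception:
                # Keep the worker alive; a failed job is logged, not retried, in this sketch.
                traceback.print_exc()
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

Call `start_worker(your_processing_function)` once at startup, then return the result of `acknowledge(raw_body)` as the HTTP status from your handler.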
Idempotency on your side
Use OCRQueen-Delivery-Id (or the body's event_id) as your idempotency key. Both are stable across retry attempts, so an ID that is already in your database means you have already seen that delivery.
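A minimal sketch of that check using SQLite, where a primary key on the delivery ID makes the duplicate test atomic. The table name, the function names, and the `d_example` ID are illustrative assumptions.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the table of seen delivery IDs if needed."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS deliveries (delivery_id TEXT PRIMARY KEY)")
    return conn


def first_time_seen(conn: sqlite3.Connection, delivery_id: str) -> bool:
    """Record the ID and return True only the first time it is seen."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO deliveries (delivery_id) VALUES (?)",
            (delivery_id,),
        )
    # rowcount is 1 when the row was inserted, 0 when the key already existed.
    return cursor.rowcount == 1


store = open_store()
print(first_time_seen(store, "d_example"))  # True
print(first_time_seen(store, "d_example"))  # False
```

When the function returns False, skip the processing but still respond with 2xx, so the duplicate is not retried again.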
