Webhook reliability at scale: the patterns that stopped 3am pages

Webhooks fail. The question is how loudly. Five patterns we apply to every client workflow to keep silent failures from being silent.

LUMIEN | April 24, 2026 | Updated June 19, 2026 | 2 min read

Half of “the automation broke” incidents are webhooks that quietly failed weeks ago. By the time anyone notices, the data backlog is hours of recovery work. These five patterns are what we apply by default on every client workflow.

1. Idempotency keys on every receiver

Webhooks retry. If your downstream effect is “create CRM record”, every retry creates a duplicate. Use the webhook delivery ID as an idempotency key in your receiver, and answer duplicates with a 200 without reprocessing them (so the sender stops retrying).
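A minimal sketch of an idempotent receiver, assuming the sender supplies a unique delivery ID with each request. SQLite stands in for whatever store the receiver already uses, and the names `make_store`, `handle_delivery`, and `apply_effect` are hypothetical.

```python
import sqlite3


def make_store(path=":memory:"):
    """Open a store that remembers which delivery IDs were already handled."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_deliveries ("
        " delivery_id TEXT PRIMARY KEY)"
    )
    conn.commit()
    return conn


def handle_delivery(conn, delivery_id, payload, apply_effect):
    """Apply the downstream effect at most once per delivery ID.

    Returns (http_status, outcome). Duplicates still get a 200 so the
    sender stops retrying. If apply_effect raises, the transaction rolls
    back, the ID is not recorded, and a later retry is processed normally.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_deliveries (delivery_id) VALUES (?)",
            (delivery_id,),
        )
        if cur.rowcount == 0:
            return 200, "duplicate"
        apply_effect(payload)
    return 200, "processed"
```

Recording the ID and applying the effect inside one transaction is a design choice for this sketch: it keeps a crashed attempt from being mistaken for a completed one. It only holds fully when the effect writes to the same database; for an external CRM call, the effect itself should also be safe to repeat.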

2. Dead-letter queue for failed deliveries

Any failed webhook should land in a DLQ (SQS, Redis list, even a Postgres table). Reprocess on a schedule. Never silently drop.
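As a rough illustration of the table-backed variant, here is a sketch in which SQLite stands in for the Postgres table. The schema and the function names (`open_dlq`, `receive`, `reprocess`) are assumptions made for the example, not a prescribed layout.

```python
import sqlite3


def open_dlq(path=":memory:"):
    """Create the dead-letter table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dead_letters ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " error TEXT NOT NULL,"
        " attempts INTEGER NOT NULL DEFAULT 1)"
    )
    conn.commit()
    return conn


def receive(conn, body, process):
    """Try to process a delivery; park it in the DLQ instead of dropping it."""
    try:
        process(body)
        return True
    except Exception as exc:
        with conn:
            conn.execute(
                "INSERT INTO dead_letters (body, error) VALUES (?, ?)",
                (body, repr(exc)),
            )
        return False


def reprocess(conn, process):
    """Retry every parked delivery; meant to be called on a schedule."""
    rows = conn.execute("SELECT id, body FROM dead_letters ORDER BY id").fetchall()
    recovered = 0
    for row_id, body in rows:
        try:
            process(body)
        except Exception as exc:
            with conn:  # still failing: keep it and record the attempt
                conn.execute(
                    "UPDATE dead_letters SET attempts = attempts + 1, error = ?"
                    " WHERE id = ?",
                    (repr(exc), row_id),
                )
        else:
            with conn:  # succeeded: remove it from the queue
                conn.execute("DELETE FROM dead_letters WHERE id = ?", (row_id,))
            recovered += 1
    return recovered
```

The `attempts` column is there so a monitor can flag entries that keep failing rather than letting them cycle forever.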

3. Signature verification before any work

Validate the HMAC signature first, before parsing the payload or touching any downstream system. Webhooks without auth get spoofed; we have seen this in the wild.
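A sketch of that check, assuming the sender signs the raw request body with HMAC-SHA256 and sends the hex digest in a header. The algorithm, encoding, and header format vary by provider, so treat those details as assumptions and check the sender's documentation.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_header) -> bool:
    """Return True only if the header matches the signature we compute.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of a forged signature were correct.
    """
    expected = sign(secret, body)
    received = signature_header or ""
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verify against the raw bytes as received. Re-serializing parsed JSON can change whitespace or key order and make a valid signature fail.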

4. Health monitor that catches “no events”

An “is the webhook working?” check is not “did the last call succeed?” It is “have we received an event in the last expected window?” If your store gets an order every 10 minutes, alert when nothing arrives for 30.
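The check itself is small. A sketch, assuming the receiver records the timestamp of the last event it accepted; `is_stale` and the 30-minute default are illustrative and should follow your own traffic pattern.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_event_at, now, max_silence=timedelta(minutes=30)):
    """True when no event has arrived within the expected window.

    A missing timestamp (no event ever recorded) also counts as stale.
    """
    if last_event_at is None:
        return True
    return now - last_event_at > max_silence


# Example: orders normally arrive about every 10 minutes.
now = datetime.now(timezone.utc)
print(is_stale(now - timedelta(minutes=10), now))  # False: within the window
print(is_stale(now - timedelta(minutes=45), now))  # True: time to alert
```

Run it from a scheduler that is independent of the receiver, so the monitor keeps working when the receiver itself is down.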

5. Replay endpoint, not a panic restore

Build an endpoint that re-emits webhook deliveries for a date range. When the worst happens (whole receiver was down), you replay; you do not write a one-off script under pressure.
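The core of such an endpoint might look like the sketch below. It assumes every delivery is logged with a delivery ID, a received-at timestamp, and the raw body; the record shape and the `send` callable are hypothetical.

```python
from datetime import datetime, timezone


def replay(deliveries, start, end, send):
    """Re-emit logged deliveries with start <= received_at < end, oldest first.

    `send` is whatever posts a delivery back to the receiver. Returns the
    IDs that were re-emitted so the caller can report what was replayed.
    """
    selected = sorted(
        (d for d in deliveries if start <= d["received_at"] < end),
        key=lambda d: d["received_at"],
    )
    for delivery in selected:
        send(delivery["delivery_id"], delivery["body"])
    return [d["delivery_id"] for d in selected]


def utc(day, hour):
    """Helper for the example timestamps below."""
    return datetime(2026, 4, day, hour, tzinfo=timezone.utc)


log = [
    {"delivery_id": "d-3", "received_at": utc(21, 9), "body": "{}"},
    {"delivery_id": "d-1", "received_at": utc(20, 8), "body": "{}"},
    {"delivery_id": "d-2", "received_at": utc(20, 23), "body": "{}"},
]
replayed = replay(log, utc(20, 0), utc(21, 0), lambda delivery_id, body: None)
print(replayed)  # ['d-1', 'd-2']
```

Replaying with the original delivery IDs means a receiver that deduplicates on those IDs (pattern 1) skips anything it already handled, so a generous date range is safe.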

The 3am page

One client (e-commerce, 400 orders/day) was silently missing 8% of orders, roughly 32 a day, for a month. Patterns 2 and 4 above would have alerted on day one. Both are in place there now.

We ship these patterns by default on every workflow automation engagement.
