Webhooks fail. The question is how loudly. Five patterns we apply to every client workflow so that failures never go unnoticed.
Half of “the automation broke” incidents are webhooks that quietly failed weeks ago. By the time anyone notices, the data backlog is hours of recovery work. These five patterns are what we apply by default on every client workflow.
Senders retry webhooks. If your downstream effect is “create CRM record”, every retry creates a duplicate unless the receiver guards against it. Use the webhook delivery ID as an idempotency key in your receiver: skip deliveries you have already processed, but still respond with 200 so the sender stops retrying.
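A minimal sketch of the idempotency check, assuming the sender provides a unique delivery ID per event. SQLite stands in for whatever store the receiver already has, and `make_store`, `handle_delivery`, and the `process` callback are hypothetical names used only for illustration.

```python
import sqlite3


def make_store(path=":memory:"):
    """Open a store with a table of delivery IDs that were already processed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_deliveries (delivery_id TEXT PRIMARY KEY)"
    )
    conn.commit()
    return conn


def handle_delivery(conn, delivery_id, payload, process):
    """Return the HTTP status code the receiver should respond with."""
    try:
        with conn:  # commits on success, rolls back if anything raises
            try:
                conn.execute(
                    "INSERT INTO seen_deliveries (delivery_id) VALUES (?)",
                    (delivery_id,),
                )
            except sqlite3.IntegrityError:
                # Already processed: acknowledge so the sender stops retrying.
                return 200
            process(payload)  # the downstream effect, e.g. create a CRM record
    except Exception:
        # Processing failed, so the ID was rolled back and a retry can succeed.
        return 500
    return 200
```

Recording the ID and running the side effect in one transaction means a failed attempt does not mark the delivery as seen. Note that this only protects the local record; if `process` calls an external system, that call itself is not rolled back.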
Any failed webhook should land in a dead-letter queue (DLQ): SQS, a Redis list, even a Postgres table. Reprocess the queue on a schedule. Never silently drop an event.
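One way to sketch the table-backed variant is below. It uses SQLite so the example runs anywhere; the same shape would apply to a Postgres table. The table layout, the function names, and the `max_attempts` cutoff are assumptions for illustration.

```python
import json
import sqlite3


def make_dlq(path=":memory:"):
    """Create the dead-letter table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dead_letters ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "error TEXT NOT NULL, "
        "attempts INTEGER NOT NULL DEFAULT 1)"
    )
    conn.commit()
    return conn


def receive(conn, payload, process):
    """Process an event; on failure, park it in the DLQ instead of dropping it."""
    try:
        process(payload)
        return True
    except Exception as exc:
        with conn:
            conn.execute(
                "INSERT INTO dead_letters (payload, error) VALUES (?, ?)",
                (json.dumps(payload), repr(exc)),
            )
        return False


def reprocess(conn, process, max_attempts=5):
    """Retry parked events; intended to be called from a scheduler."""
    rows = conn.execute(
        "SELECT id, payload FROM dead_letters WHERE attempts < ?",
        (max_attempts,),
    ).fetchall()
    recovered = 0
    for row_id, raw in rows:
        try:
            process(json.loads(raw))
        except Exception as exc:
            with conn:
                conn.execute(
                    "UPDATE dead_letters SET attempts = attempts + 1, error = ? "
                    "WHERE id = ?",
                    (repr(exc), row_id),
                )
        else:
            with conn:
                conn.execute("DELETE FROM dead_letters WHERE id = ?", (row_id,))
            recovered += 1
    return recovered
```

Events that reach `max_attempts` are no longer retried but stay in the table, so a person can inspect them rather than discovering weeks later that they vanished.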
Validate the HMAC signature first, before parsing or acting on the payload. Webhooks without authentication get spoofed; we have seen this in the wild.
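A minimal sketch of the signature check, assuming the sender signs the raw request body with HMAC-SHA256 and sends the hex digest in a header. Real providers differ in the details (prefixes such as `sha256=`, base64 encoding, signed timestamps), so treat the format here as an assumption and check the sender's documentation. The function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True only if the header matches the signature we compute."""
    if not signature_header:
        return False
    expected = sign(secret, body)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The check has to run against the raw bytes as received. Re-serializing parsed JSON can change whitespace or key order and make a valid signature fail.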
An “is the webhook working?” check is not “did the last call succeed?” It is “have we received an event in the last expected window?” If your store gets an order every 10 minutes, alert when nothing arrives for 30.
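The heartbeat check can be sketched as a single comparison, assuming the receiver records the timestamp of the most recent event somewhere. `is_stale` and the 30-minute default are illustrative; the window should come from the workflow's real traffic pattern.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_event_at, now, max_silence=timedelta(minutes=30)):
    """Return True when no event has arrived within the expected window."""
    if last_event_at is None:
        # Never having received an event is also worth an alert.
        return True
    return now - last_event_at > max_silence


# Example: run this from a scheduler and alert when it returns True.
last_seen = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
check_time = datetime(2024, 1, 1, 12, 31, tzinfo=timezone.utc)
should_alert = is_stale(last_seen, check_time)
```

A fixed window will fire false alarms overnight for a store with daytime-only traffic, so a per-hour or per-weekday window may be needed in practice.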
Build an endpoint that re-emits webhook deliveries for a date range. When the worst happens (the whole receiver was down), you replay the missed window instead of writing a one-off script under pressure.
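The core of such an endpoint might look like the sketch below, assuming deliveries are logged with their original ID and timestamp. The `Delivery` record, the `replay` function, and the `send` callback are hypothetical; an HTTP handler would parse the date range from the request and call `replay`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Delivery:
    delivery_id: str
    received_at: datetime
    payload: dict


def replay(
    log: Iterable[Delivery],
    start: datetime,
    end: datetime,
    send: Callable[[Delivery], None],
) -> List[str]:
    """Re-emit every logged delivery with start <= received_at < end."""
    replayed = []
    for delivery in sorted(log, key=lambda d: d.received_at):
        if start <= delivery.received_at < end:
            send(delivery)  # re-emit with the original delivery ID
            replayed.append(delivery.delivery_id)
    return replayed
```

Re-emitting with the original delivery IDs lets an idempotent receiver skip anything it already processed, so replaying a wider range than strictly necessary is safe.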
One client (e-commerce, 400 orders/day) was silently missing 8% of orders for a month. Patterns 2 and 4 above (the dead-letter queue and the heartbeat check) would have alerted on day one. Now they do.
We ship these patterns by default on every workflow automation engagement.