Reconciling Stripe Webhooks Per Tenant
A Stripe webhook arrives with Stripe's identifiers, not yours, so every handler must answer one question before it does anything else: which tenant does this event belong to? This guide sits within Billing Sync with Stripe and shows how to resolve a customer or subscription back to a tenant, verify the signature, process the event exactly once even when it arrives twice or out of order, and catch the drift that webhooks inevitably miss with a daily reconciliation job.
Problem Framing
Stripe is the system of record for money; your application is the system of record for tenants. A webhook is Stripe telling you that money state changed — a payment succeeded, a subscription was canceled, an invoice was finalized. The event payload carries cus_..., sub_..., and in_... identifiers, but never your tenant_id. If you map the event to the wrong tenant, you upgrade the wrong account, revoke the wrong customer's access, or credit usage to a tenant that never incurred it. If you process the same event twice, you double-apply a state change. If you process events in the order they arrive rather than the order they occurred, a stale customer.subscription.updated can overwrite a newer cancellation.
Four things break in practice. First, trusting the payload's identifiers without verifying the signature lets an attacker forge a payment_intent.succeeded and unlock a paid plan for free. Second, looking up the tenant by the wrong key (joining on a Stripe customer email instead of the customer ID) breaks the moment two tenants share an email or a customer changes theirs. Third, non-idempotent handlers double-apply because Stripe delivers at least once and retries any non-2xx response. Fourth, out-of-order events corrupt state because Stripe does not guarantee ordering; the event's created timestamp, not the delivery time, is the ordering signal. A resource's own created field is fixed when the resource is first created, so it cannot order later updates to that resource.
The handler path that keeps these boundaries intact runs in a fixed order: verify the signature first, resolve the tenant from a stored ID mapping, dedupe on the event ID, then apply the change only if the event is newer than the last one recorded for that resource.
Step-by-Step Guide
1. Verify the signature on the raw request body
Stripe signs every webhook with your endpoint's signing secret. Verification must run against the unparsed body: any JSON middleware that parses and re-serializes the payload changes the bytes and invalidates the HMAC. Use stripe.webhooks.constructEvent, which checks both the signature and, by default, a five-minute timestamp tolerance to limit replay.
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const endpointSecret = process.env.STRIPE_WEBHOOK_SECRET!;

export function parseEvent(rawBody: Buffer, signature: string): Stripe.Event {
  // rawBody MUST be the unmodified bytes; constructEvent verifies the HMAC
  // and throws if the signature or timestamp check fails.
  return stripe.webhooks.constructEvent(rawBody, signature, endpointSecret);
}
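To make the signature check less of a black box, here is a minimal sketch of the checks that signature verification involves, assuming the documented Stripe-Signature header layout (`t=<timestamp>,v1=<hex signature>`) and an HMAC-SHA256 over `<timestamp>.<raw body>`. The helper name `verifyStripeSignature` and its parameters are hypothetical and exist only for illustration; in production, keep calling constructEvent.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper for illustration; not part of the Stripe SDK.
export function verifyStripeSignature(
  rawBody: string,
  header: string,
  secret: string,
  toleranceSec = 300,
  nowSec = Math.floor(Date.now() / 1000),
): boolean {
  // Header looks like "t=1700000000,v1=<hex>" and may carry several v1 entries.
  const parts = header.split(",").map((p) => p.trim().split("="));
  const timestamp = parts.find(([k]) => k === "t")?.[1];
  const candidates = parts.filter(([k]) => k === "v1").map(([, v]) => v ?? "");
  if (!timestamp || candidates.length === 0) return false;

  // Reject signatures older than the tolerance window to limit replay.
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || nowSec - ts > toleranceSec) return false;

  // The signed payload is "<timestamp>.<raw body>", keyed with the endpoint secret.
  const expected = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest();

  // Constant-time comparison against every v1 candidate.
  return candidates.some((sig) => {
    const given = Buffer.from(sig, "hex");
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

The sketch shows why a re-serialized body fails: a single changed byte in rawBody produces a different HMAC.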
2. Resolve the tenant from a stored ID mapping
Never derive the tenant from email or metadata you cannot trust. When you first create a Stripe customer you persist the tenant_id ↔ customer_id link; the webhook handler reads that link back. Index the table on stripe_customer_id and stripe_subscription_id so resolution is a single lookup.
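async function resolveTenant(event: Stripe.Event): Promise<string> {
  const obj = event.data.object as {
    object?: string;
    id?: string;
    customer?: string | { id: string } | null;
  };
  // customer.* events carry the customer itself, so its id is the customer ID.
  // Other billing objects reference the customer through `customer`, which is
  // an object rather than a string when expanded. Never fall back to a sub_ or in_ id.
  const customerId =
    obj.object === "customer"
      ? obj.id
      : typeof obj.customer === "string"
        ? obj.customer
        : obj.customer?.id;
  if (!customerId) throw new Error(`no customer on event ${event.id}`);
  const row = await db.oneOrNone(
    `SELECT tenant_id FROM stripe_links WHERE stripe_customer_id = $1`,
    [customerId],
  );
  if (!row) throw new Error(`no tenant for customer ${customerId}`);
  return row.tenant_id;
}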
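The link itself is written once, at customer creation, and only read afterwards. A minimal in-memory sketch of that contract follows; the class `TenantLinks` is a hypothetical stand-in for the stripe_links table, and a real implementation would rely on a unique index rather than a Map.

```typescript
// Hypothetical in-memory stand-in for the stripe_links table.
export class TenantLinks {
  private byCustomer = new Map<string, string>();

  // Called once, right after the Stripe customer is created for a tenant.
  link(tenantId: string, customerId: string): void {
    const existing = this.byCustomer.get(customerId);
    if (existing !== undefined && existing !== tenantId) {
      // Same effect as a unique constraint on stripe_customer_id.
      throw new Error(`customer ${customerId} already linked to ${existing}`);
    }
    this.byCustomer.set(customerId, tenantId);
  }

  // Called by the webhook handler; returns null for unknown customers.
  resolve(customerId: string): string | null {
    return this.byCustomer.get(customerId) ?? null;
  }
}
```

Refusing to relink a customer to a second tenant is the property that keeps one payment from ever landing on two accounts.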
3. Make the handler idempotent on the event ID
Stripe delivers at least once and retries on any non-2xx response. Record every event.id in a uniquely constrained table inside the same transaction that applies the change. On a duplicate delivery the insert conflicts on event_id and returns no row, so the handler skips the change and still returns 200, applying nothing twice.
async function processOnce(event: Stripe.Event, tenantId: string): Promise<void> {
  await db.tx(async (t) => {
    const inserted = await t.oneOrNone(
      `INSERT INTO processed_events (event_id, tenant_id)
       VALUES ($1, $2) ON CONFLICT (event_id) DO NOTHING
       RETURNING event_id`,
      [event.id, tenantId],
    );
    if (!inserted) return; // already processed; no-op, still ack 200
    await applyEvent(t, event, tenantId);
  });
}
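The same pattern can be sketched without a database to show its two guarantees: a duplicate does nothing, and a failed apply leaves no trace, so a retry can succeed. The class `EventLedger` is hypothetical and synchronous for brevity; the real handler gets both properties from the transaction and the unique constraint.

```typescript
type Apply = (eventId: string) => void;

// Hypothetical in-memory stand-in for processed_events plus its transaction.
export class EventLedger {
  private seen = new Set<string>();

  // Returns true if the change was applied, false if the event was a duplicate.
  processOnce(eventId: string, apply: Apply): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    try {
      apply(eventId);
      return true;
    } catch (err) {
      // Mirror a transaction rollback: forget the id so a retry can succeed.
      this.seen.delete(eventId);
      throw err;
    }
  }
}
```

Returning whether the change was applied is one possible design choice; it lets a caller skip follow-up work for duplicates.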
4. Apply events in resource order, not delivery order
Stripe does not guarantee ordering. Compare the event's created timestamp against the last applied timestamp for that subscription and skip anything older. The subscription's own created field is set once, when the subscription is first created, so it cannot order later updates. This keeps a late customer.subscription.updated from clobbering a newer customer.subscription.deleted. Mirror the same exactly-once discipline you use for idempotent usage event ingestion on the metering side.
async function applyEvent(t: any, event: Stripe.Event, tenantId: string): Promise<void> {
  const sub = event.data.object as Stripe.Subscription;
  const eventTs = event.created; // when Stripe created the event, not when it was delivered
  // Conditional write: only advance if this event is newer than what we hold.
  // event.created has one-second resolution, so a same-second tie is skipped
  // and left for the reconciliation job to correct.
  await t.none(
    `UPDATE subscriptions
     SET status = $1, last_event_ts = $2
     WHERE tenant_id = $3 AND stripe_subscription_id = $4
       AND last_event_ts < $2`,
    [sub.status, eventTs, tenantId, sub.id],
  );
}
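The conditional write reduces to a small pure function, sketched here so the ordering rule can be tested without a database. The names `applyIfNewer`, `SubscriptionState`, and `SubscriptionEvent` are hypothetical, and `created` stands for whichever ordering timestamp the handler compares against last_event_ts.

```typescript
interface SubscriptionState {
  status: string;
  lastEventTs: number;
}

interface SubscriptionEvent {
  status: string;
  created: number; // ordering timestamp in Unix seconds
}

// Same rule as the SQL guard `last_event_ts < $2`: ties and older events are skipped.
export function applyIfNewer(
  current: SubscriptionState | undefined,
  ev: SubscriptionEvent,
): SubscriptionState {
  if (current && ev.created <= current.lastEventTs) return current;
  return { status: ev.status, lastEventTs: ev.created };
}
```

Feeding it a cancellation stamped 300 followed by an update stamped 200 leaves the state canceled, which is the behavior the guard is meant to protect.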
5. Acknowledge fast, do heavy work async
Return 200 within Stripe's timeout (a few seconds) or it marks the delivery failed and retries, multiplying load. Verify, resolve, dedupe, and enqueue — then ack. Do downstream fan-out (provisioning, email, audit) from the queue worker, where retries are yours to control.
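app.post("/webhooks/stripe", express.raw({ type: "application/json" }),
  async (req, res) => {
    let event: Stripe.Event;
    try {
      event = parseEvent(req.body, req.header("stripe-signature") ?? "");
    } catch {
      return res.status(400).send("invalid signature");
    }
    try {
      const tenantId = await resolveTenant(event);
      await processOnce(event, tenantId);
      await queue.enqueue("stripe.fanout", { eventId: event.id, tenantId });
    } catch (err) {
      // Express 4 does not catch rejections from async handlers, so without
      // this block a failure would leave the request hanging until it times out.
      // A 500 asks Stripe to retry. Unknown-customer errors can be acked with
      // 200 instead once resolveTenant distinguishes them from database failures.
      console.error("stripe webhook failed", event.id, err);
      return res.status(500).send("processing failed");
    }
    res.status(200).send("ok"); // ack before fan-out work
  });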
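Which status code to return is a policy decision worth writing down in one place. The sketch below is one possible mapping, assuming a hypothetical `Outcome` type; the 400 and 200 cases follow this guide, while returning 500 for transient failures is an assumption chosen so that Stripe retries them.

```typescript
// Hypothetical outcome labels for a single webhook delivery.
type Outcome =
  | "invalid_signature"
  | "unknown_customer"
  | "processed"
  | "duplicate"
  | "transient_error";

export function statusFor(outcome: Outcome): number {
  switch (outcome) {
    case "invalid_signature":
      return 400; // not verifiably from Stripe; nothing to retry
    case "unknown_customer":
      return 200; // ack and alert; a retry cannot create the missing mapping
    case "processed":
    case "duplicate":
      return 200; // duplicates are a successful no-op
    case "transient_error":
      return 500; // assumption: let Stripe retry database or queue failures
  }
}
```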
6. Run a daily reconciliation job
Webhooks get dropped, endpoints have outages, and bugs skip events. A daily job lists each tenant's current Stripe subscription via the API and corrects any local row that disagrees. This is the safety net that turns "mostly consistent" into "provably consistent."
async function reconcileTenant(tenantId: string, customerId: string): Promise<void> {
  // status: "all" is required to see canceled subscriptions; the default list
  // omits them, which would hide exactly the missed cancellations this job exists for.
  // Iterating with for await auto-paginates past the first page.
  const subs = stripe.subscriptions.list({ customer: customerId, status: "all", limit: 100 });
  for await (const sub of subs) {
    const local = await db.oneOrNone(
      `SELECT status FROM subscriptions
       WHERE tenant_id = $1 AND stripe_subscription_id = $2`,
      [tenantId, sub.id],
    );
    if (!local || local.status !== sub.status) {
      // GREATEST keeps last_event_ts from moving backwards on an existing row.
      await db.none(
        `INSERT INTO subscriptions (tenant_id, stripe_subscription_id, status, last_event_ts)
         VALUES ($1, $2, $3, $4)
         ON CONFLICT (stripe_subscription_id)
         DO UPDATE SET status = EXCLUDED.status,
           last_event_ts = GREATEST(subscriptions.last_event_ts, EXCLUDED.last_event_ts)`,
        [tenantId, sub.id, sub.status, sub.created],
      );
      await recordDrift(tenantId, sub.id, local?.status ?? null, sub.status);
    }
  }
}
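The comparison at the heart of the job can be isolated as a pure function, which makes the drift rule easy to unit test. This is a minimal sketch; `findDrift` and its record types are hypothetical names, and it only covers subscriptions that Stripe returns.

```typescript
interface LocalSub {
  subscriptionId: string;
  status: string;
}

interface RemoteSub {
  id: string;
  status: string;
}

interface Drift {
  subscriptionId: string;
  localStatus: string | null; // null when the local row is missing
  remoteStatus: string;
}

// Returns one entry per Stripe subscription whose local status is missing or different.
export function findDrift(local: LocalSub[], remote: RemoteSub[]): Drift[] {
  const localById = new Map<string, string>(
    local.map((l): [string, string] => [l.subscriptionId, l.status]),
  );
  const drift: Drift[] = [];
  for (const r of remote) {
    const localStatus = localById.get(r.id) ?? null;
    if (localStatus !== r.status) {
      drift.push({ subscriptionId: r.id, localStatus, remoteStatus: r.status });
    }
  }
  return drift;
}
```

Local rows that Stripe no longer returns are not reported by this sketch; detecting them would need a second pass in the other direction.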
Verification
Replay a known event with the Stripe CLI and confirm the handler is idempotent and tenant-correct. The first delivery applies the change; the second is a no-op that still returns 200.
# Terminal 1: forward signed test events to your local endpoint
stripe listen --forward-to localhost:3000/webhooks/stripe
# Terminal 2: trigger a test event
stripe trigger customer.subscription.updated
# Resend the same event id; the repeat delivery must change nothing
stripe events resend evt_1Abc... --webhook-endpoint we_1Xyz...
Assert the contract in tests: a duplicate event ID is swallowed, an out-of-order event is ignored, and an unsigned request is rejected.
test("duplicate event id is processed at most once", async () => {
  await processOnce(event, "acme");
  await processOnce(event, "acme"); // retry
  // Assumes applyEvent writes one audit_log row per applied event.
  const count = await db.one(
    `SELECT count(*)::int FROM audit_log WHERE event_id = $1`, [event.id]);
  expect(count.count).toBe(1);
});

test("stale event does not overwrite newer state", async () => {
  // Seed a subscriptions row for acme first: applyEvent only updates existing rows.
  await db.tx((t) => applyEvent(t, newerCancelEvent, "acme")); // status=canceled
  await db.tx((t) => applyEvent(t, olderUpdateEvent, "acme")); // arrives late
  const row = await db.one(
    `SELECT status FROM subscriptions WHERE tenant_id = 'acme'`);
  expect(row.status).toBe("canceled");
});
Every correction the reconciliation job makes should be written to your tenant audit logging architecture so a recurring drift between Stripe and your database becomes a visible, investigable signal rather than silent revenue leakage.
Failure Modes & Gotchas
- Signature verification always fails. Symptom: every webhook returns 400. Root cause: a body-parsing middleware ran before the handler and mutated the raw bytes. Fix: register a raw-body parser scoped to the webhook route only, before any JSON parser.
- Wrong tenant gets upgraded. Symptom: a payment unlocks features for an account that never paid. Root cause: tenant resolved by email or untrusted metadata. Fix: resolve only via the stored customer_id ↔ tenant_id link, indexed and unique.
- State flickers between active and canceled. Symptom: a subscription oscillates as retries arrive. Root cause: applying events in delivery order. Fix: gate every write on last_event_ts < new_ts, where new_ts is the event's created timestamp.
- Reconciliation never converges. Symptom: the daily job rewrites the same rows every run. Root cause: comparing fields Stripe normalizes differently (cents vs. dollars, enum casing). Fix: compare the exact Stripe-returned values and store them verbatim.
FAQ
Why not put the tenant_id in Stripe metadata and read it from the webhook?
You can set metadata, but treat it as a hint, not a source of truth — metadata can be edited in the Stripe dashboard and is not present on every event type. Always confirm against your own indexed customer_id ↔ tenant_id mapping so a stale or missing metadata field cannot misroute money state.
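A minimal sketch of treating metadata as a hint: the stored mapping always decides the tenant, and a disagreeing metadata value only raises a flag for investigation. The function `confirmTenant` and its parameter names are hypothetical.

```typescript
interface TenantDecision {
  tenantId: string | null; // always taken from the stored mapping
  mismatch: boolean; // true when metadata disagrees with the mapping
}

// The mapping is the source of truth; metadata is only cross-checked against it.
export function confirmTenant(
  mappedTenantId: string | null,
  metadataTenantId: string | undefined,
): TenantDecision {
  const mismatch =
    mappedTenantId !== null &&
    metadataTenantId !== undefined &&
    metadataTenantId !== mappedTenantId;
  return { tenantId: mappedTenantId, mismatch };
}
```

When the mapping is missing, the metadata value is deliberately not used as a fallback.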
Do I still need reconciliation if my webhook handler is correct?
Yes. Webhooks are best-effort: endpoints have outages, deploys drop in-flight deliveries, and Stripe eventually stops retrying. A daily list-and-compare job against the Stripe API is the only thing that guarantees your local state converges to Stripe's, and it surfaces silent drift you would otherwise never see.
How do I handle events for a customer my application has never seen?
Acknowledge and alert rather than guess. If resolveTenant finds no mapping, return a 200 to stop retries, but log the orphaned customer_id and raise an alert for investigation. It usually means a customer was created out-of-band in the dashboard, and silently dropping it hides a real gap.