How I Would Fix Webhooks Failing Silently in a Flutter and Firebase Paid Acquisition Funnel Using Launch Ready
The symptom is usually ugly in the same way every time: a user pays, the app says "success", but the backend never marks the order complete, never unlocks access, or never sends the onboarding email. In a paid acquisition funnel, that means lost revenue, support tickets, and users thinking your product is broken.
The most likely root cause is not "Flutter" itself. It is usually one of these: the webhook endpoint is misconfigured, Firebase Cloud Functions are not logging failures clearly, the payment provider cannot reach your endpoint, or the handler returns 200 before it actually finishes work. The first thing I would inspect is the full request path from payment event to Firebase write: provider dashboard, Cloud Function logs, Firestore writes, and any retry or dead-letter behavior.
Triage in the First Hour
1. Check the payment provider event log.
- Look for delivered, failed, retried, or pending webhook events.
- Confirm whether the provider got a 2xx response or timed out.
2. Open Firebase logs for the exact function.
- In Google Cloud Logging, filter by function name and timestamp.
- Look for unhandled exceptions, cold start delays, auth failures, and permission errors.
3. Verify the webhook endpoint URL.
- Confirm it points to production, not staging.
- Check for trailing slash mismatches, wrong region paths, or old deployment URLs.
4. Inspect Firestore or Realtime Database writes.
- Confirm the expected document was created or updated.
- Check whether security rules blocked the write.
5. Review environment variables and secrets.
- Confirm the signing secret, API keys, and project ID are set for production, and that production values are not reused in staging.
- Make sure no secret was rotated without redeploying.
6. Test DNS and Cloudflare routing.
- Confirm Cloudflare is not caching webhook responses.
- Verify SSL mode is correct and origin certs are valid.
7. Check retries and idempotency.
- Confirm duplicate events are handled safely.
- Look for logic that drops repeated webhook IDs without processing them.
8. Inspect Flutter app assumptions.
- Confirm the app does not assume immediate fulfillment after payment.
- Make sure it polls status or listens to backend state instead of trusting client-side success alone.
```sh
firebase functions:log --only processWebhook
```
If that command shows nothing while payments are being made, I assume routing or deployment is broken before I assume business logic is broken.
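For the endpoint check in step 3, a small comparison helper can make environment, region, and trailing slash mismatches obvious. This is a minimal sketch: `diffWebhookUrls` is a hypothetical helper, and both URLs are made-up placeholders rather than real endpoints.

```javascript
// Compare the endpoint configured at the payment provider with the deployed
// function URL and list every difference found.
function diffWebhookUrls(configured, deployed) {
  const a = new URL(configured);
  const b = new URL(deployed);
  const problems = [];

  if (a.protocol !== b.protocol) problems.push(`protocol: ${a.protocol} vs ${b.protocol}`);
  if (a.host !== b.host) problems.push(`host: ${a.host} vs ${b.host}`);

  // Compare paths without trailing slashes first, then flag a slash-only difference.
  const stripSlash = (p) => (p.length > 1 ? p.replace(/\/+$/, "") : p);
  if (stripSlash(a.pathname) !== stripSlash(b.pathname)) {
    problems.push(`path: ${a.pathname} vs ${b.pathname}`);
  } else if (a.pathname !== b.pathname) {
    problems.push("trailing slash mismatch");
  }
  return problems;
}

// Hypothetical values: a staging host with a trailing slash vs the production URL.
const configured = "https://us-central1-example-staging.cloudfunctions.net/processWebhook/";
const deployed = "https://us-central1-example-prod.cloudfunctions.net/processWebhook";
console.log(diffWebhookUrls(configured, deployed));
```

An empty array means the two URLs agree on protocol, host, and path. Anything else is worth fixing before looking at business logic.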
Root Causes
| Likely cause | How to confirm | Business impact |
|---|---|---|
| Wrong webhook URL or environment | Compare provider endpoint with deployed function URL | Events go nowhere |
| Signature verification failing | Logs show invalid signature or missing secret | All events rejected |
| Function returns early | Logs show 200 before Firestore write completes | Silent data loss |
| Firestore rules block write | Logs show permission denied | Funnel stops after payment |
| Cloudflare caching or proxy issue | Provider retries with no origin hit | Webhook never reaches backend |
| Missing retries or idempotency | Duplicate events create confusion or skipped updates | Random failures and support load |
1. Wrong webhook URL or wrong environment.
- I confirm this by comparing the exact endpoint in Stripe, Paddle, Lemon Squeezy, or whichever provider you use against the deployed Firebase function URL.
- If staging and production share similar names, this gets missed often during launch rushes.
2. Signature verification failing silently.
- I check whether raw request bodies are preserved before parsing.
- Many providers require raw payload verification. If JSON parsing happens first, signatures fail even though the payload looks valid.
3. The function exits before async work finishes.
- I look for `await` missing on database writes or email calls.
- This creates a false success response while downstream tasks never complete.
4. Firestore security rules block server writes.
- I confirm whether the Admin SDK is used correctly from Cloud Functions.
- If someone accidentally uses client SDK patterns on the server side, rules can block writes or create inconsistent access behavior.
5. Cloudflare proxying breaks delivery.
- I check whether webhook routes are behind aggressive caching rules or bot protection settings.
- Webhooks should be treated as API traffic with low friction and clear allowlists where possible.
6. Missing observability and retry strategy.
- I check whether structured logging exists for event ID, user ID, payment intent ID, and fulfillment state.
- Without those fields, silent failure becomes expensive guesswork.
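Root cause 2 is easier to see with a concrete case. The sketch below assumes a generic scheme, an HMAC-SHA256 hex digest over the request body with one shared secret. Real providers define their own header format and signing rules, so treat `verifyRawBodySignature` and the sample values as illustrative only. In Cloud Functions for Firebase the unparsed bytes are available as `req.rawBody`.

```javascript
const crypto = require("node:crypto");

// Verify an HMAC-SHA256 hex signature over the exact bytes the provider sent.
// The scheme itself is an assumption made for this example.
function verifyRawBodySignature(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== "string" || !secret) return false;
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so check length first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}

// Made-up secret and payload for demonstration.
const secret = "whsec_example_only";
const rawBody = Buffer.from('{"id":"evt_1","amount":1000}');
const signature = crypto.createHmac("sha256", secret).update(rawBody).digest("hex");

// Parsing and re-serialising changes the bytes (whitespace here),
// so the same signature no longer matches.
const reserialised = Buffer.from(JSON.stringify(JSON.parse(rawBody.toString()), null, 2));

console.log(verifyRawBodySignature(rawBody, signature, secret));
console.log(verifyRawBodySignature(reserialised, signature, secret));
```

The payload still looks valid after re-serialising, which is why this failure tends to go unnoticed until every event is rejected.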
The Fix Plan
My goal is to repair this without making a bigger mess. I would keep changes small: isolate webhook handling into one path, make it idempotent, log every step with an event ID, and only return success after durable storage confirms completion.
1. Freeze non-essential changes for 24 hours.
- No UI polish work until payment fulfillment is stable.
- A broken funnel burns ad spend faster than a delayed feature release.
2. Add structured logging around every webhook step.
- Log receipt of event ID, signature result, parsing result, database write result, and final response code.
- Keep logs free of secrets and customer PII.
3. Validate raw body handling in Firebase Functions.
- Ensure signature verification uses raw payload when required by your provider.
- If needed, separate verification from business logic so parsing cannot corrupt validation.
4. Make fulfillment idempotent.
- Store provider event ID in Firestore before processing business actions.
- Reject duplicates safely with a known processed state instead of re-running side effects.
5. Move critical side effects behind confirmed persistence.
- First write an internal fulfillment record with status "pending".
- Then send emails or unlock access only after that record exists successfully.
6. Add explicit error handling and retries.
- Catch expected errors separately from unknown ones.
- Return non-2xx only when you want the provider to retry; otherwise mark as handled and queue follow-up work internally.
7. Use a queue if work is more than trivial.
- If fulfillment includes email delivery, CRM syncs, entitlement changes, and analytics calls, do not run everything inline in one request once p95 latency under load passes roughly 500 ms to 1 s.
- A queue reduces timeout risk and makes failures visible instead of silent.
8. Lock down secrets properly.
- Put signing secrets in Firebase environment config or Secret Manager only.
- Rotate any exposed key immediately if it was ever committed to GitHub or shared in chat tools.
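For step 2, one JSON log line per webhook step is usually enough to trace an event end to end. This is a minimal sketch: the field names, the `SENSITIVE_KEYS` list, and both helpers are assumptions, not part of any Firebase API. Cloud Logging can read a `severity` field from JSON written to stdout, which is why the entry includes one.

```javascript
// Keys that must never reach the logs. The list is illustrative.
const SENSITIVE_KEYS = new Set(["email", "signature", "secret", "cardNumber"]);

// Build a structured entry for one webhook step, redacting sensitive fields.
function buildLogEntry(step, eventId, details = {}) {
  const safeDetails = {};
  for (const [key, value] of Object.entries(details)) {
    safeDetails[key] = SENSITIVE_KEYS.has(key) ? "[redacted]" : value;
  }
  // Core fields go last so details cannot overwrite them.
  return { ...safeDetails, severity: details.error ? "ERROR" : "INFO", step, eventId };
}

// Emit the entry as a single JSON line and return it for inspection.
function logStep(step, eventId, details) {
  const entry = buildLogEntry(step, eventId, details);
  console.log(JSON.stringify(entry));
  return entry;
}

logStep("received", "evt_1", { provider: "example" });
logStep("signature_checked", "evt_1", { valid: true, signature: "abc123" });
logStep("write_failed", "evt_1", { error: "permission-denied", email: "user@example.com" });
```

Because every line carries the same `eventId`, filtering logs by that one value shows the full path of a single payment.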
A simple defensive shape looks like this:
```js
// Verify first: nothing from an unverified payload should reach storage.
await verifySignature(req);
if (!eventId) throw new Error("Missing event id");
if (await alreadyProcessed(eventId)) return res.status(200).send("duplicate");
await savePendingFulfillment(eventId);
await applyEntitlementChange();
await markFulfilled(eventId);
return res.status(200).send("ok");
```

That order matters because it turns silent failure into visible state transitions you can debug quickly.
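The state transitions in that snippet can be exercised locally before touching production. The sketch below uses an in-memory `Map` as a stand-in for a Firestore collection. `createStore`, `handleEvent`, and the status names are hypothetical, and only a `fulfilled` record counts as already processed, so a retry after a failure runs again.

```javascript
// In-memory stand-in for a Firestore collection keyed by provider event ID.
function createStore() {
  const records = new Map();
  return {
    get: async (id) => records.get(id),
    set: async (id, record) => { records.set(id, { ...record }); },
  };
}

async function handleEvent(store, eventId, applyEntitlementChange) {
  if (!eventId) return { status: 400, body: "missing event id" };

  const existing = await store.get(eventId);
  if (existing && existing.status === "fulfilled") {
    return { status: 200, body: "duplicate" }; // side effects already ran once
  }

  await store.set(eventId, { status: "pending" });
  try {
    await applyEntitlementChange(eventId);
  } catch (err) {
    // Leave a visible failed state and ask the provider to retry.
    await store.set(eventId, { status: "failed", reason: String(err.message) });
    return { status: 500, body: "retry" };
  }
  await store.set(eventId, { status: "fulfilled" });
  return { status: 200, body: "ok" };
}

// Deliver the same event twice and count how often access is granted.
async function demo() {
  const store = createStore();
  let grants = 0;
  const grant = async () => { grants += 1; };
  const first = await handleEvent(store, "evt_1", grant);
  const second = await handleEvent(store, "evt_1", grant);
  return { first, second, grants };
}

demo().then((result) => console.log(JSON.stringify(result)));
```

Against real Firestore, the check and the first write would likely need to run inside a transaction so that two concurrent deliveries cannot both pass the duplicate check.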
Regression Tests Before Redeploy
I would not redeploy this fix without proving three things: valid events succeed exactly once and duplicates are harmlessly ignored; invalid events fail loudly; and failed downstream steps leave an audit trail that support can see within minutes.
Acceptance criteria:
1. Valid webhook test passes end to end within 5 seconds p95 in staging.
2. Duplicate delivery of the same event does not create duplicate entitlements or duplicate emails.
3. Invalid signature returns 400 and appears in logs with a clear reason code.
4. Firestore write succeeds under production-like security rules using service credentials only where intended.
5. No secret values appear in logs or client builds.
6. A simulated downstream failure leaves status as "pending" or "failed", not invisible success.
QA checks I would run:
- Send one real sandbox payment through the full funnel from Flutter checkout to Firebase fulfillment record creation.
- Replay the same webhook event twice and confirm no duplicate state change occurs across both runs.
- Break one dependency at a time:
- remove an env var,
- deny Firestore access,
- force an invalid signature,
- simulate timeout on email service call,
then verify each case fails predictably with logs and alerting enabled.
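The "break one dependency at a time" checks can be written as a small table of injected failures with the response expected for each. This is a sketch under stated assumptions: `WebhookError`, its reason codes, and the status mapping are hypothetical, and the provider is assumed to retry on non-2xx responses.

```javascript
// Hypothetical error type: `code` is the reason logged, `retryable` says
// whether the provider should deliver the event again.
class WebhookError extends Error {
  constructor(code, retryable) {
    super(code);
    this.name = "WebhookError";
    this.code = code;
    this.retryable = retryable;
  }
}

// Map a failure to the response the provider sees.
function toResponse(err) {
  if (!(err instanceof WebhookError)) return { status: 500, reason: "unclassified" };
  if (err.code === "invalid_signature") return { status: 400, reason: err.code };
  return { status: err.retryable ? 500 : 200, reason: err.code };
}

// One injected failure per dependency, with the status expected for each.
const cases = [
  { name: "missing env var", err: new WebhookError("missing_env_var", true), expect: 500 },
  { name: "Firestore denied", err: new WebhookError("firestore_permission_denied", true), expect: 500 },
  { name: "invalid signature", err: new WebhookError("invalid_signature", false), expect: 400 },
  { name: "email timeout", err: new WebhookError("email_timeout", false), expect: 200 },
  { name: "unknown bug", err: new TypeError("x is undefined"), expect: 500 },
];

const results = cases.map(({ name, err, expect }) => {
  const { status, reason } = toResponse(err);
  return { name, status, reason, ok: status === expect };
});
console.log(results);
```

The email timeout returns 200 here on the assumption that the fulfillment record is already stored and the email is queued for follow-up internally, so a provider retry would add nothing.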
I also want basic monitoring gates:
- Error rate below 1 percent on webhook requests over 24 hours
- p95 handler latency below 800 ms if inline processing remains
- Zero unclassified exceptions during smoke testing
- At least one alert route tested before release
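The first two gates are simple arithmetic over request samples, so they can be checked in a script. The sketch below assumes a sample shape of `{ status, latencyMs }`, counts only 5xx responses as errors, and uses a nearest-rank percentile. All three are choices made for illustration, not the only reasonable ones.

```javascript
// Nearest-rank percentile over a list of latencies in milliseconds.
function percentile(values, p) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Check the error rate and p95 latency gates against a batch of samples.
function checkGates(requests, { maxErrorRate = 0.01, maxP95Ms = 800 } = {}) {
  const errors = requests.filter((r) => r.status >= 500).length;
  const errorRate = requests.length === 0 ? 0 : errors / requests.length;
  const p95Ms = percentile(requests.map((r) => r.latencyMs), 95);
  return { errorRate, p95Ms, pass: errorRate < maxErrorRate && p95Ms < maxP95Ms };
}

// Synthetic sample: 200 requests, one server error, latencies rising from 100 ms.
const samples = Array.from({ length: 200 }, (_, i) => ({
  status: i === 0 ? 500 : 200,
  latencyMs: 100 + i * 3,
}));
console.log(checkGates(samples));
```

Running the same check over the last 24 hours of exported request logs gives a single pass or fail answer before release.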
Prevention
This failure should not come back quietly again. The fix is not just code; it is guardrails around deploys, security checks on API traffic, and better visibility into funnel health.
1. Add monitoring on business events as well as server errors.
- Track "payment_succeeded", "webhook_received", "fulfillment_created", and "access_granted".
- If payments rise but fulfillments drop, you catch revenue loss early.
2. Add alerting for missing conversions within a time window.
- Example: alert if paid checkout count exceeds fulfilled access count by more than 3 percent over 15 minutes at peak traffic.
3. Review webhook code with API security in mind every time it changes.
- Check authentication assumptions, authorization boundaries, input validation, secret handling, rate limits, CORS, logging, dependency risk, and least privilege access to Firestore and external APIs.
4. Keep client app logic out of trust decisions where possible.
- Flutter should display state from Firebase records rather than assume payment means access was granted instantly on-device.
5. Use safe deploy practices for production functions.

```mermaid
flowchart TD
  A[Pay] --> B[Hook]
  B --> C[Log]
  C --> D[Write]
  D --> E[Grant]
  E --> F[Alert]
```

6. Reduce blast radius with smaller releases:
- one function change per deploy,
- one env change per deploy,
- one funnel route verified per deploy
7. Add lightweight review checks:
- Does this change preserve raw request body?
- Does it handle duplicates?
- Does it log event IDs?
- Can support tell what happened without asking engineering?
When to Use Launch Ready
Launch Ready fits when you need the whole delivery layer cleaned up fast: domain setup, email deliverability basics like SPF/DKIM/DMARC, Cloudflare config, SSL checks, deployment hardening, secrets management setup, and uptime monitoring hooks, all fitted together in one pass rather than patched piecemeal over weeks.
It is a good fit when:
- your funnel works locally but fails in production,
- your paid ads are live but conversions are leaking,
- your webhooks need production-safe deployment plus monitoring,
- you want DNS redirects/subdomains/SSL/caching/DDoS protection handled correctly before spending more on traffic
What you should prepare before booking:
- Firebase project access
- Payment provider dashboard access
- Domain registrar access
- Cloudflare account access
- Current production/staging URLs
- List of env vars currently used
- One clear description of what should happen after successful payment
If you bring me those pieces cleanly organized by account owner and environment name, I can spend my time fixing the actual failure path instead of chasing permissions for half of day one.
Take the next step
If this is a problem in your product right now, here is what to do next:
- **[Use the free Cyprian tools](/tools)** - estimate cost, score app risk, check launch readiness, or pick the right service sprint.
- **[Book a discovery call](/contact)** - I will tell you honestly whether you need a sprint or if you can DIY the next step.

*Written by Cyprian Tinashe Aarons - senior full-stack and AI engineer helping founders rescue, launch, automate, and scale AI-built products.*