How I Would Fix Webhooks Failing Silently in a React Native and Expo Mobile App Using Launch Ready
The symptom is usually ugly but easy to miss: the app says "saved" or "connected", but the downstream system never receives the webhook, and nobody gets an error. In a React Native and Expo stack, the most likely root cause is not the webhook provider itself, but how the mobile app triggers the request, stores state, or handles retries after the user leaves the screen.
The first thing I would inspect is whether the webhook is being sent from the mobile client at all. If it is, I would then check whether the request is failing because of auth, browser-style assumptions such as CORS that do not carry over to native requests, bad environment variables, or a missing server-side relay. In most cases, I recommend moving webhook delivery out of the mobile app and into a backend or edge function so you are not trusting a phone app to act like infrastructure.
Triage in the First Hour
1. Check whether any webhook events were actually received by the destination.
- Look at provider logs first: Stripe, Slack, HubSpot, Zapier, Make, your own API.
- Confirm timestamps, event IDs, and HTTP response codes.
2. Inspect app logs from real devices.
- Use Sentry, LogRocket, Firebase Crashlytics, or Expo device logs.
- Search for failed fetch calls, timeouts, 401s, 403s, 500s, and network errors.
3. Verify environment variables in the build that shipped.
- Compare local `.env`, EAS build profile values, staging values, and production values.
- Confirm webhook URLs are present and not pointing to localhost or staging.
4. Check whether the app is sending from client-side code.
- Find where `fetch`, `axios`, or SDK calls are made.
- If secrets are inside the app bundle, treat that as a production risk.
5. Review backend or edge logs if a relay exists.
- Check request count versus expected count.
- Look for rate limits, validation failures, signature failures, or queue backlog.
6. Inspect network conditions on device.
- Test on Wi-Fi and cellular.
- Check backgrounding behavior and whether requests die when the app moves to background.
7. Review recent deploys and config changes.
- New EAS build?
- New domain?
- Cloudflare proxy change?
- SSL certificate renewal issue?
- Redirect loop?
8. Confirm monitoring coverage.
- If there is no alert for failed webhook delivery within 5 minutes, that is part of the problem.
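The environment check in step 3 can be partly automated at app startup. The following is a minimal sketch, not a definitive implementation: the function name `validateRelayUrl`, the list of local hosts, and the rule that a production host must not contain "staging" are all assumptions made for illustration.

```typescript
// Hypothetical helper: sanity-checks the relay URL that shipped in a build.
type UrlCheck = { ok: true; url: string } | { ok: false; reason: string };

// Hosts that only make sense on a developer machine or emulator (assumed list).
const LOCAL_HOSTS = ["localhost", "127.0.0.1", "10.0.2.2"];

function validateRelayUrl(raw: string | undefined, isProduction: boolean): UrlCheck {
  if (!raw || raw.trim() === "") {
    return { ok: false, reason: "relay URL is missing" };
  }
  const trimmed = raw.trim();

  // Parse scheme and host with a regex rather than relying on a URL polyfill.
  const match = /^(https?):\/\/([^\/:?#]+)/i.exec(trimmed);
  if (!match) {
    return { ok: false, reason: "relay URL is not an absolute http(s) URL" };
  }
  const scheme = (match[1] ?? "").toLowerCase();
  const host = (match[2] ?? "").toLowerCase();

  if (isProduction) {
    if (scheme !== "https") {
      return { ok: false, reason: "production relay URL must use https" };
    }
    if (LOCAL_HOSTS.includes(host)) {
      return { ok: false, reason: "production relay URL points at a local host" };
    }
    // Assumption: staging hosts contain the word "staging".
    if (host.includes("staging")) {
      return { ok: false, reason: "production relay URL points at staging" };
    }
  }
  return { ok: true, url: trimmed };
}
```

In an Expo app this might be called once at launch, for example `validateRelayUrl(process.env.EXPO_PUBLIC_RELAY_URL, !__DEV__)`, where `EXPO_PUBLIC_RELAY_URL` is a hypothetical variable name. A failed check should be logged and reported, not swallowed.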
Root Causes
| Likely cause | How to confirm | Why it fails silently |
|---|---|---|
| Webhook sent directly from React Native client | Search code for direct external POST requests with secrets or signed payloads | Mobile apps are unreliable delivery agents and can be killed mid-request |
| Missing or wrong env vars in Expo/EAS build | Compare runtime config against expected values in build logs | The app may "work" locally but post to nowhere in production |
| Request blocked by auth or signature mismatch | Check server logs for 401/403/422 responses | The UI may not surface non-200 responses clearly |
| Background execution interrupted | Reproduce by sending then switching apps immediately | Mobile OS can suspend network work before completion |
| Retry logic absent or broken | Inspect queue/retry code; look for one-shot fire-and-forget calls | A transient failure becomes permanent data loss |
| API endpoint protected by Cloudflare/WAF rules | Review firewall events and bot protection logs | Requests get dropped before your app sees them |
The biggest architectural mistake I see is using a mobile client as if it were a backend worker. That creates launch risk: failed deliveries, support tickets from users who think they completed an action successfully, and data inconsistency between systems.
The Fix Plan
1. Move webhook delivery off the mobile client.
- The React Native app should call your own API only.
- Your API should enqueue or forward the webhook server-side.
2. Add an idempotent delivery layer.
- Store each outbound event with a unique ID.
- Mark states like `pending`, `sent`, `failed`, and `retrying`.
- Prevent duplicate sends when users tap twice or reconnect.
3. Validate inputs before sending anything out.
- Reject missing user IDs, malformed payloads, empty event types, and invalid URLs early.
- Return clear errors to the app instead of pretending success.
4. Add retries with backoff on safe failures only.
- Retry 3 times over 10 minutes for network errors and 5xx responses.
- Do not retry hard failures such as 400-level validation errors until the payload has been corrected.
5. Log every outbound attempt with correlation IDs.
- Store request ID, user ID, event name, destination service, status code, latency, and error message.
- Make sure sensitive fields are redacted.
6. Put secrets in server-side environment variables only.
- Never ship signing keys inside Expo public config if they can be used to send webhooks directly.
- Rotate any exposed keys immediately.
7. Add a dead-letter path for failed deliveries.
- Failed jobs after max retries should be flagged for manual review or reprocessing.
- This prevents silent loss.
8. Tighten API security controls around the relay endpoint.
- Require authentication on internal endpoints.
- Rate limit abusive traffic.
- Restrict CORS if there is any browser-facing surface involved.
- Validate signatures on inbound webhooks if you also receive them.
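The retry and dead-letter rules in steps 4 and 7 can be expressed as one small decision function on the server side. This is a sketch under stated assumptions: the names `decideNext`, `AttemptResult`, and `Decision` are hypothetical, and the 1, 3, and 6 minute delays are just one schedule that adds up to the 10 minutes mentioned above.

```typescript
// Outcome of a single outbound delivery attempt.
type AttemptResult =
  | { kind: "response"; status: number }
  | { kind: "network_error" };

// What the worker should do next with this event.
type Decision =
  | { action: "done" }
  | { action: "retry"; delayMs: number }
  | { action: "dead_letter"; reason: string };

// Assumed schedule: 3 retries spread over 10 minutes (1 + 3 + 6 minutes).
const RETRY_DELAYS_MS = [60 * 1000, 3 * 60 * 1000, 6 * 60 * 1000];

function decideNext(result: AttemptResult, retriesSoFar: number): Decision {
  if (result.kind === "response") {
    if (result.status >= 200 && result.status < 300) {
      return { action: "done" };
    }
    // Anything below 500 that is not a success is treated as a hard failure.
    if (result.status < 500) {
      return { action: "dead_letter", reason: `non-retryable status ${result.status}` };
    }
  }

  // Network errors and 5xx responses are the only retryable cases here.
  const delayMs = RETRY_DELAYS_MS[retriesSoFar];
  if (delayMs === undefined) {
    return { action: "dead_letter", reason: "max retries exhausted" };
  }
  return { action: "retry", delayMs };
}
```

Keeping the decision pure makes it easy to unit test without a queue. Whether a 429 from the destination should be retried is a policy choice this sketch leaves out; as written it goes to the dead-letter path.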
A simple diagnostic pattern I would use during repair:
```bash
curl -i https://api.yourdomain.com/webhooks/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"event":"test.delivery","userId":"123"}'
```

If this returns success but nothing arrives downstream, the bug is in your relay or queue layer. If it returns an error but the app still shows success, your UI state handling is lying to users and needs fixing immediately.
Regression Tests Before Redeploy
I would not ship this fix without testing both delivery correctness and failure behavior. Silent failures are worse than visible failures because they create false confidence.
Acceptance criteria:
- A successful trigger creates exactly one outbound event record.
- The destination receives the webhook within 5 seconds under normal conditions.
- A network timeout triggers retry logic without duplicate payloads.
- A 400-level validation error shows a visible user-facing error state in the app.
- A failed delivery appears in logs and monitoring within 1 minute.
- Secrets do not appear in client bundles or public Expo config files.
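The "exactly one outbound event record" criterion depends on the delivery layer being idempotent. A minimal in-memory sketch of that idea follows, assuming the client sends a unique ID with each action. The `Outbox` class and its method names are hypothetical, and the `Map` stands in for a database table with a unique constraint on the event ID.

```typescript
type DeliveryState = "pending" | "sent" | "failed" | "retrying";

interface OutboundEvent {
  id: string; // idempotency key supplied with the trigger
  eventName: string;
  payload: unknown;
  state: DeliveryState;
}

// In-memory stand-in for a persistent outbox table.
class Outbox {
  private events = new Map<string, OutboundEvent>();

  // Returns the existing record if this ID was already enqueued,
  // so a double tap or a reconnect never creates a second event.
  enqueue(id: string, eventName: string, payload: unknown): { created: boolean; event: OutboundEvent } {
    const existing = this.events.get(id);
    if (existing) {
      return { created: false, event: existing };
    }
    const event: OutboundEvent = { id, eventName, payload, state: "pending" };
    this.events.set(id, event);
    return { created: true, event };
  }

  // Records the result of a delivery attempt.
  mark(id: string, state: DeliveryState): void {
    const event = this.events.get(id);
    if (!event) {
      throw new Error(`unknown event ${id}`);
    }
    event.state = state;
  }

  count(): number {
    return this.events.size;
  }
}
```

A regression test for duplicate taps can then enqueue the same ID twice and assert that `count()` is still 1.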
QA checks:
- Test on iPhone and Android devices with airplane mode toggled mid-request.
- Test backgrounding immediately after tapping submit.
- Test slow networks and packet loss simulation if possible.
- Test duplicate taps on CTA buttons to confirm idempotency works.
- Test one real production-like event end to end before release.
I would also add one negative test set:
- expired token
- invalid payload shape
- destination returning 429
- destination returning 500
- missing env var
- stale build using old config
If you have CI/CD already wired up through EAS Build or another pipeline, I would block release unless these checks pass:
- unit test coverage of at least 80% for delivery logic
- integration test for outbound webhook flow
- lint plus typecheck
- one smoke test against staging
- rollback plan documented
Prevention
The long-term fix is not just code. It is making silent failure impossible to ignore.
Monitoring guardrails:
- Alert if webhook success rate drops below 98 percent over 15 minutes.
- Alert if p95 delivery latency exceeds 2 seconds for internal relay endpoints, or 5 seconds for third-party providers whose infrastructure and queue depth you do not control.
- Alert on repeated retries from one user session or one device build version.
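The success-rate alert above reduces to a sliding-window calculation over delivery outcomes. The sketch below assumes outcomes are available as timestamped records; the names `successRate` and `shouldAlert` are hypothetical, and the defaults simply mirror the 98 percent and 15 minute figures in the first bullet.

```typescript
interface DeliveryOutcome {
  timestampMs: number; // when the attempt finished
  ok: boolean;         // true if the destination accepted the webhook
}

// Fraction of successful deliveries inside the window, or null if there were none.
function successRate(outcomes: DeliveryOutcome[], nowMs: number, windowMs: number): number | null {
  const recent = outcomes.filter(
    (o) => o.timestampMs > nowMs - windowMs && o.timestampMs <= nowMs
  );
  if (recent.length === 0) {
    return null;
  }
  const succeeded = recent.filter((o) => o.ok).length;
  return succeeded / recent.length;
}

// Defaults follow the guardrail above: below 98 percent over 15 minutes.
function shouldAlert(
  outcomes: DeliveryOutcome[],
  nowMs: number,
  windowMs: number = 15 * 60 * 1000,
  threshold: number = 0.98
): boolean {
  const rate = successRate(outcomes, nowMs, windowMs);
  return rate !== null && rate < threshold;
}
```

An empty window does not trigger this alert, so a total stop in traffic would need its own check, for example an alert on zero deliveries during hours when traffic is expected.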
Code review guardrails:
- No direct secret-bearing outbound calls from mobile clients unless there is no alternative and risk has been reviewed carefully.
- Any new webhook path must include logging, timeout handling at 3 to 10 seconds depending on dependency behavior, retry rules, and idempotency keys.
- Reviews must check failure behavior first: what happens when the device is offline, when auth expires, or when the endpoint returns a non-2xx response?
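Timeout handling and explicit non-success results can be illustrated with a small wrapper around `fetch`. This is a sketch, not a drop-in implementation: `postWithTimeout`, `FetchLike`, and `SendResult` are hypothetical names, the 5 second default is an assumption inside the 3 to 10 second range above, and the fetch function is passed in so the wrapper can be tested without a network.

```typescript
// Minimal shapes so the wrapper can be exercised with a stub in tests.
interface MinimalResponse {
  ok: boolean;
  status: number;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal }
) => Promise<MinimalResponse>;

// Every outcome is explicit, so the caller cannot mistake a failure for success.
type SendResult =
  | { outcome: "accepted"; status: number }
  | { outcome: "rejected"; status: number }
  | { outcome: "timeout" }
  | { outcome: "network_error"; message: string };

async function postWithTimeout(
  fetchImpl: FetchLike,
  url: string,
  body: unknown,
  timeoutMs: number = 5000
): Promise<SendResult> {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
    // fetch resolves on 4xx/5xx too, so the status has to be checked explicitly.
    return res.ok
      ? { outcome: "accepted", status: res.status }
      : { outcome: "rejected", status: res.status };
  } catch (err) {
    if (controller.signal.aborted) {
      return { outcome: "timeout" };
    }
    return { outcome: "network_error", message: err instanceof Error ? err.message : String(err) };
  } finally {
    clearTimeout(timer);
  }
}
```

In the app, the global `fetch` would be passed as `fetchImpl`. The point of the union return type is that a reviewer can see at the call site which branches the UI handles.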
Security guardrails:
- Keep signing keys server-side only when possible.
- Use least privilege service accounts for any queue worker or relay service.
- Redact payload fields that could expose customer data in logs.
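Log redaction is easier to enforce when it happens in one function that every log call passes through. A minimal sketch follows; the `redact` helper and the `SENSITIVE_KEYS` list are assumptions for illustration, and a real list would come from reviewing the payloads you actually send.

```typescript
// Assumed list of field names that must never reach logs (matched case-insensitively).
const SENSITIVE_KEYS = ["email", "phone", "token", "authorization", "password", "secret"];

// Returns a deep copy of the value with sensitive fields replaced by a marker.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEYS.includes(key.toLowerCase()) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  // Primitives are returned unchanged.
  return value;
}
```

Because the function returns a copy, the payload that is actually delivered stays untouched; only the logged version is redacted.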
UX guardrails:
- Show explicit states: sending, sent, failed, retrying later.
- Give users a recovery action if delivery fails: retry now or contact support with reference ID.
- Do not show "success" until your own backend has at least acknowledged the request.
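These states can be modeled as a small reducer so that "sent" is only reachable after the backend responds. The sketch below is one possible shape under stated assumptions: the state and event names (`reduceSubmit`, `ACCEPTED`, `FAILED`, `willRetry`) are hypothetical, and the reference ID is assumed to come back from your own API.

```typescript
type SubmitState =
  | { status: "idle" }
  | { status: "sending" }
  | { status: "sent" }
  | { status: "retrying"; referenceId: string }
  | { status: "failed"; referenceId: string; message: string };

type SubmitEvent =
  | { type: "SUBMIT" }
  | { type: "ACCEPTED" } // backend acknowledged the request
  | { type: "FAILED"; referenceId: string; message: string; willRetry: boolean };

function reduceSubmit(state: SubmitState, event: SubmitEvent): SubmitState {
  const inFlight = state.status === "sending" || state.status === "retrying";
  switch (event.type) {
    case "SUBMIT":
      // Ignore duplicate taps while a submission is already in flight.
      return inFlight ? state : { status: "sending" };
    case "ACCEPTED":
      // "sent" is only reachable from an in-flight submission.
      return inFlight ? { status: "sent" } : state;
    case "FAILED":
      if (!inFlight) {
        return state;
      }
      return event.willRetry
        ? { status: "retrying", referenceId: event.referenceId }
        : { status: "failed", referenceId: event.referenceId, message: event.message };
  }
}
```

This plugs into React's `useReducer` directly. The `referenceId` on the failed state is what the user would quote to support.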
Performance guardrails:
- Keep client-side work light so submission does not get dropped during navigation changes or rendering delays.
- Avoid heavy synchronous work right after submit; that can delay network dispatch on slower phones.
When to Use Launch Ready
Launch Ready fits when you need this fixed fast without turning it into a long rebuild.
This sprint makes sense if:
- your product already works in parts but production behavior is unreliable
- you need a safe launch path in under 2 days
- you suspect config drift across Expo builds or environments
- you want one senior engineer to audit flow end to end instead of patching symptoms
What I need from you before starting:
- access to Expo/EAS project settings
- repository access
- hosting access for API/relay/backend
- Cloudflare access if DNS or firewall rules are involved
- list of webhook providers affected
- sample failing event IDs if available
- screenshots or screen recordings of the broken flow
My approach would be simple: verify where events die first, move delivery into infrastructure you control second, then add observability so this does not become another silent outage next month.
---
*Written by Cyprian Tinashe Aarons - senior full-stack and AI engineer helping founders rescue, launch, automate, and scale AI-built products.*