Error Handling and Retry Logic in API Integrations
Production API integrations face a hostile environment: network failures, transient errors, rate limits, authentication expiry, and third-party outages. Robust error handling and retry logic are the difference between integrations that degrade gracefully and those that cascade failures through your system.
HTTP Status Code Taxonomy
- 2xx Success: Request succeeded — 200 OK, 201 Created, 204 No Content
- 3xx Redirect: Client should follow redirect
- 4xx Client errors: The request is wrong. 400 Bad Request (fix your request), 401 Unauthorized (authenticate), 403 Forbidden (insufficient permissions), 404 Not Found, 422 Unprocessable Content (validation failed), 429 Too Many Requests (back off)
- 5xx Server errors: The server failed — potentially transient. 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout
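The taxonomy above can be expressed as a small lookup on the numeric range of the status code. This is a minimal sketch; the function name `classify_status` and the returned labels are illustrative choices, not part of any library.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to the coarse class it belongs to."""
    if 200 <= status < 300:
        return "success"
    if 300 <= status < 400:
        return "redirect"
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    # 1xx informational codes and anything out of range land here.
    return "unknown"
```

Branching on the class first, and on individual codes such as 429 only where they need special treatment, keeps the handling code short.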
What to Retry
Retry transient failures: 429 Too Many Requests (after waiting for the interval given in the Retry-After header, when present), 500/502/503/504 server errors (where the request was idempotent), and network timeouts and connection errors. Do not retry other 4xx client errors: the same request will fail the same way until it is corrected.
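One way to encode these rules is a pure decision function that the HTTP client consults after each failed attempt. The sketch below rests on several assumptions: the names `should_retry` and `retry_after_seconds` are hypothetical, a `status` of `None` stands for a network timeout or connection error, network errors are treated like 5xx responses (retried only when the request is safe to repeat), and only the delta-seconds form of `Retry-After` is parsed (the HTTP-date form is ignored).

```python
from typing import Optional

# Status codes the text above treats as potentially transient.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}
# Methods defined as idempotent by HTTP semantics.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


def should_retry(method: str, status: Optional[int],
                 has_idempotency_key: bool = False) -> bool:
    """Decide whether a failed attempt is worth repeating.

    status is None when no response arrived (timeout, connection error).
    """
    safe_to_repeat = method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
    if status is None:
        # The server may or may not have processed the request.
        return safe_to_repeat
    if status == 429:
        # Assumption: a rate-limited request was rejected before processing.
        return True
    if status in RETRYABLE_STATUSES:
        return safe_to_repeat
    return False


def retry_after_seconds(header_value: Optional[str]) -> Optional[float]:
    """Parse the delta-seconds form of a Retry-After header value."""
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    return None  # HTTP-date form is not handled in this sketch.
```

For example, `should_retry("POST", 503)` is false, while the same call with `has_idempotency_key=True` is true.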
Exponential Backoff with Jitter
Retry with exponential backoff: wait 1 second, then 2, then 4, then 8, doubling each time up to a maximum delay. Add random jitter to each delay to prevent a thundering herd, where many clients that failed together all retry at the same instant. Maximum retry attempts are typically 3-5 for synchronous requests and more for background jobs.
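The schedule described above might be sketched as follows. This uses the "full jitter" variant, where each wait is drawn uniformly between zero and the exponential ceiling; it is one of several ways to add jitter. The names `TransientError`, `backoff_ceiling`, and `call_with_retries`, and the defaults (4 attempts, 1 second base, 30 second cap), are assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_ceiling(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound on the wait after a 0-based attempt: 1, 2, 4, 8, ... up to cap."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(operation: Callable[[], T], max_attempts: int = 4,
                      base: float = 1.0, cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying TransientError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Full jitter: pick a random wait between 0 and the ceiling.
            sleep(random.uniform(0, backoff_ceiling(attempt, base, cap)))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets tests record the delays instead of actually waiting.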
Idempotency
Only retry requests that are idempotent, meaning that sending the same request several times has the same effect on the server as sending it once. GET, PUT, and DELETE are naturally idempotent. POST is not, so POST operations must use idempotency keys (provided by the API or generated by your client) to enable safe retry.
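To illustrate how an idempotency key makes a POST safe to repeat, here is a minimal sketch with a client-generated key and an in-memory stand-in for the server. Everything here is hypothetical: `FakePaymentsAPI`, `create_charge`, and the response shape are invented for the example, and real APIs differ in how they accept the key (often a request header whose name the provider documents).

```python
import uuid


def new_idempotency_key() -> str:
    """Generate one key per logical operation and reuse it on every retry."""
    return str(uuid.uuid4())


class FakePaymentsAPI:
    """Hypothetical in-memory server that honours idempotency keys."""

    def __init__(self) -> None:
        self._responses_by_key: dict = {}
        self.charges_created = 0

    def create_charge(self, amount: int, idempotency_key: str) -> dict:
        # A repeated key returns the stored response instead of charging again.
        if idempotency_key in self._responses_by_key:
            return self._responses_by_key[idempotency_key]
        self.charges_created += 1
        response = {"id": f"ch_{self.charges_created}", "amount": amount}
        self._responses_by_key[idempotency_key] = response
        return response


api = FakePaymentsAPI()
key = new_idempotency_key()
first = api.create_charge(500, key)
retry = api.create_charge(500, key)  # e.g. resent after a timeout
```

The key must be created once, before the first attempt, and reused unchanged. Generating a fresh key on each retry would defeat the deduplication.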