Idempotency on the server is only half the story. The other half is the client knowing when and how to retry safely. A client that retries too aggressively can turn a recovering service into a failed one. A client that never retries wastes the safety guarantees you just built.
This part covers the mechanics of retry logic: exponential backoff, jitter, retry budgets, and which errors are safe to retry at all.
Which Errors Are Safe to Retry
Not every error means “try again.” Retrying the wrong errors wastes resources or makes things worse.
| Condition | Retry? | Why |
|---|---|---|
| Network timeout | Yes | Request may not have reached server |
| Connection refused | Yes | Server temporarily unavailable |
| 5xx Server Error | Yes (with limits) | Server-side transient failure |
| 429 Too Many Requests | Yes (with Retry-After) | Rate limited, back off first |
| 408 Request Timeout | Yes | Server timed out waiting for the request; resend it |
| 400 Bad Request | No | Client error, retrying won’t help |
| 401 Unauthorized | No | Fix auth first |
| 404 Not Found | No | Resource does not exist |
| 422 Unprocessable | No | Logic error, not transient |
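The table above maps naturally onto a small predicate. Here is a minimal sketch using reqwest's `StatusCode` (the function name and exact policy are illustrative; adapt them to your own API's semantics):

```rust
use reqwest::StatusCode;

/// Rough encoding of the retry table: true means "worth retrying".
fn is_retryable(status: StatusCode) -> bool {
    match status {
        // Rate limiting and server-side request timeouts are transient
        StatusCode::TOO_MANY_REQUESTS | StatusCode::REQUEST_TIMEOUT => true,
        // 5xx is assumed transient, but only retry a bounded number of times
        s if s.is_server_error() => true,
        // Everything else (remaining 4xx and anything unexpected) goes back to the caller
        _ => false,
    }
}
```

Transport-level failures (network timeouts, connection refused) never produce a status code at all; the full client later in this part treats those as retryable too.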
Exponential Backoff
The simplest retry strategy — retrying immediately after failure — is also the most dangerous at scale. If 10,000 clients all fail simultaneously and all retry at once, they generate a thundering herd that can push a recovering server back into failure.
Exponential backoff spaces retries out by doubling the wait time between attempts. The delay after attempt n (counting attempts from zero) is roughly base_delay * 2^n: with a one-second base, the waits run 1s, 2s, 4s, 8s, and so on, usually capped at a maximum.
```mermaid
gantt
title Retry Timeline (base_delay = 1s, max = 30s)
dateFormat X
axisFormat %s
section Attempts
Attempt 1 (fail) :0, 1
Wait 1s :1, 2
Attempt 2 (fail) :2, 3
Wait 2s :3, 5
Attempt 3 (fail) :5, 6
Wait 4s :6, 10
Attempt 4 (fail) :10, 11
Wait 8s :11, 19
Attempt 5 (success) :19, 20
```
Adding Jitter
Backoff alone is not enough if all clients fail at the same time: they will still retry in sync at the same intervals, so the thundering herd simply moves later. Jitter adds randomness to the delay so clients desynchronize.
AWS recommends “full jitter”: pick a random value between zero and the calculated backoff. This spreads retries uniformly across the window and significantly reduces load spikes on recovering services.
```rust
use std::time::Duration;
use rand::Rng;
pub struct RetryConfig {
pub max_attempts: u32,
pub base_delay_ms: u64,
pub max_delay_ms: u64,
}
impl RetryConfig {
pub fn delay_for_attempt(&self, attempt: u32) -> Duration {
        // Exponential backoff: base * 2^attempt
        // (saturating math so a large attempt count cannot overflow u64)
        let exponential = self.base_delay_ms.saturating_mul(2u64.saturating_pow(attempt));
// Cap at max delay
let capped = exponential.min(self.max_delay_ms);
// Full jitter: random between 0 and capped
let jittered = rand::thread_rng().gen_range(0..=capped);
Duration::from_millis(jittered)
}
}
// Usage
let config = RetryConfig {
max_attempts: 5,
base_delay_ms: 200,
max_delay_ms: 30_000, // 30 seconds cap
};
```
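To get a feel for the spread, here is a small hypothetical demo (assuming the `RetryConfig` above); the exact values differ on every run because of the jitter:

```rust
fn main() {
    let config = RetryConfig {
        max_attempts: 5,
        base_delay_ms: 200,
        max_delay_ms: 30_000,
    };
    for attempt in 0..config.max_attempts {
        // Three samples per attempt to show how full jitter spreads the delay
        let samples: Vec<u128> = (0..3)
            .map(|_| config.delay_for_attempt(attempt).as_millis())
            .collect();
        println!("attempt {attempt}: {samples:?} ms");
    }
}
```

Each attempt's delay can land anywhere between zero and the capped exponential value, which is exactly what keeps a fleet of clients from retrying in lockstep.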
A Full Retry Client in Rust
Here is a complete retry wrapper using reqwest that handles idempotency key persistence across attempts and applies full jitter backoff.
```rust
use reqwest::{Client, Response, StatusCode};
use serde::Serialize;
use std::time::Duration;
use tokio::time::sleep;
use uuid::Uuid;
pub struct IdempotentClient {
inner: Client,
config: RetryConfig,
}
impl IdempotentClient {
pub fn new(config: RetryConfig) -> Self {
Self {
inner: Client::new(),
config,
}
}
pub async fn post_with_retry<T: Serialize>(
&self,
url: &str,
body: &T,
) -> anyhow::Result<Response> {
// Generate key once -- reused across all retries
let idempotency_key = Uuid::new_v4().to_string();
let body_bytes = serde_json::to_vec(body)?;
let mut last_error = None;
for attempt in 0..self.config.max_attempts {
let result = self
.inner
.post(url)
.header("Idempotency-Key", &idempotency_key)
.header("Content-Type", "application/json")
.body(body_bytes.clone())
.timeout(Duration::from_secs(30))
.send()
.await;
match result {
Ok(resp) => {
let status = resp.status();
// Success
if status.is_success() {
return Ok(resp);
}
// Do not retry client errors (4xx), except 408 and 429
if status.is_client_error()
&& status != StatusCode::REQUEST_TIMEOUT
&& status != StatusCode::TOO_MANY_REQUESTS
{
return Ok(resp); // Return to caller to handle
}
// For 429, respect Retry-After header if present
if status == StatusCode::TOO_MANY_REQUESTS {
if let Some(retry_after) = resp.headers().get("Retry-After") {
if let Ok(secs) = retry_after.to_str().unwrap_or("0").parse::<u64>() {
sleep(Duration::from_secs(secs)).await;
continue;
}
}
}
last_error = Some(anyhow::anyhow!("Server error: {}", status));
}
Err(e) if e.is_timeout() || e.is_connect() => {
last_error = Some(anyhow::anyhow!("Network error: {}", e));
}
Err(e) => {
return Err(anyhow::anyhow!("Non-retryable error: {}", e));
}
}
// Not the last attempt -- wait before retrying
if attempt + 1 < self.config.max_attempts {
let delay = self.config.delay_for_attempt(attempt);
tracing::warn!(
attempt = attempt + 1,
delay_ms = delay.as_millis(),
key = %idempotency_key,
"Retrying request"
);
sleep(delay).await;
}
}
Err(last_error.unwrap_or_else(|| anyhow::anyhow!("Max retries exceeded")))
}
}
```
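A sketch of how the wrapper might be called from an async context; the payload type, endpoint URL, and runtime setup here are illustrative assumptions, not part of the client itself:

```rust
#[derive(serde::Serialize)]
struct CreatePayment {
    amount_cents: u64,
    currency: String,
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = IdempotentClient::new(RetryConfig {
        max_attempts: 5,
        base_delay_ms: 200,
        max_delay_ms: 30_000,
    });
    let payment = CreatePayment { amount_cents: 2_500, currency: "USD".into() };
    // The idempotency key is generated once inside post_with_retry and reused
    // on every attempt, so the server can deduplicate across retries.
    let resp = client
        .post_with_retry("https://api.example.com/payments", &payment)
        .await?;
    println!("final status: {}", resp.status());
    Ok(())
}
```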
Retry Budgets
Exponential backoff caps how long any single retry sequence runs. But in a microservice with many instances, each making its own retries, the aggregate retry load can still overwhelm a downstream service. A retry budget limits the fraction of total requests that can be retries at any given time.
The idea: track how many of your last N requests were retries. If retries exceed a threshold (say 10%), stop retrying and fail fast. This prevents a cascade where one struggling service causes all its callers to saturate it further with retry traffic.
```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
/// Tracks what fraction of requests have been retries.
/// This sketch keeps lifetime counters for brevity; a production budget would
/// use a sliding window or token bucket so that old traffic ages out.
pub struct RetryBudget {
total_requests: Arc<AtomicU32>,
retry_requests: Arc<AtomicU32>,
max_retry_fraction: f64, // e.g. 0.10 for 10%
}
impl RetryBudget {
pub fn can_retry(&self) -> bool {
let total = self.total_requests.load(Ordering::Relaxed) as f64;
let retries = self.retry_requests.load(Ordering::Relaxed) as f64;
if total == 0.0 {
return true;
}
(retries / total) < self.max_retry_fraction
}
pub fn record_attempt(&self, is_retry: bool) {
self.total_requests.fetch_add(1, Ordering::Relaxed);
if is_retry {
self.retry_requests.fetch_add(1, Ordering::Relaxed);
}
}
}
```
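A sketch of how the budget might gate a retry decision. The 10% threshold is the example figure from above, and the literal construction assumes this code sits in the same module as `RetryBudget`, since the fields are private:

```rust
// One budget, shared by every caller that talks to the same downstream service.
fn shared_budget() -> Arc<RetryBudget> {
    Arc::new(RetryBudget {
        total_requests: Arc::new(AtomicU32::new(0)),
        retry_requests: Arc::new(AtomicU32::new(0)),
        max_retry_fraction: 0.10, // at most 10% of traffic may be retries
    })
}

// Returns true if the attempt may proceed; retries are dropped once the budget is spent.
fn attempt_allowed(budget: &RetryBudget, is_retry: bool) -> bool {
    if is_retry && !budget.can_retry() {
        return false; // fail fast instead of piling onto a struggling service
    }
    budget.record_attempt(is_retry);
    true
}
```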
Non-Idempotent Operations: Do Not Retry
There is one hard rule: never retry a request without an idempotency key if the operation has side effects. Without a key, the server cannot deduplicate — and retrying is identical to sending two separate requests. The client code we built above always attaches an idempotency key for POST requests, which is exactly right. Remove the key and the retry logic becomes dangerous.
Summary
Retries are necessary for reliability. Naive retries cause thundering herds and duplicate side effects. Exponential backoff with full jitter spreads retry load. Retry budgets prevent cascading amplification. And none of this is safe without idempotency keys on the server side. In Part 5, we move to a different delivery mechanism entirely — message queues — where the retry semantics are controlled by the broker rather than the client.
