Now that you understand what rate limiting is and why it matters, let’s explore how it actually works. This post will give you the technical foundation to understand and implement basic rate limiting strategies.
How Rate Limiting Works
At its core, rate limiting involves three key components:
1. Identity: Who is making the request? (IP address, user ID, API key)
2. Counter: How many requests has this identity made?
3. Time Window: Within what time period are we counting?
The system checks each incoming request against these rules and either allows it through or blocks it.
Common Rate Limiting Algorithms
1. Fixed Window Counter
This is the simplest approach. Imagine dividing time into fixed chunks (like 1-minute windows) and counting requests in each chunk.
Window 1 (0:00-0:59): 50 requests ✅
Window 2 (1:00-1:59): 100 requests ✅
Window 3 (2:00-2:59): 150 requests ❌ (limit: 100)
Pros: Simple to implement, memory efficient
Cons: Can allow bursts at window boundaries; a client can send the full limit at the end of one window and again at the start of the next, roughly doubling the effective rate for a brief period
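To make this concrete, here is a minimal in-memory sketch of a fixed window counter in JavaScript (the map, function name, and `now` parameter are ours for illustration; a real deployment would keep the counters in a shared store such as Redis):

```javascript
// Fixed window counter: one counter per (identity, window number) pair
const fixedWindowCounters = new Map();

function fixedWindowAllow(identity, limit = 100, windowSizeMs = 60_000, now = Date.now()) {
  const window = Math.floor(now / windowSizeMs);   // which fixed window are we in?
  const key = `${identity}:${window}`;
  const count = (fixedWindowCounters.get(key) || 0) + 1;
  fixedWindowCounters.set(key, count);
  return count <= limit;                           // allow while under this window's limit
}
```

Note how the counter resets implicitly: a new window number produces a new key, so the count starts over from zero.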
2. Sliding Window Log
Instead of fixed windows, this approach tracks the exact timestamp of each request and slides the time window continuously.
Current time: 14:30:45
Look back 1 minute: Count requests from 14:29:45 to 14:30:45
The window slides forward continuously with the current time
Pros: Very accurate, no burst issues
Cons: Memory intensive (stores a timestamp for every request in the window)
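A sliding window log can be sketched like this in JavaScript (again an illustrative in-memory version with names of our choosing; the key idea is pruning timestamps that have fallen out of the window before counting):

```javascript
// Sliding window log: keep one timestamp per request, count only those in the window
const requestLogs = new Map(); // identity -> array of request timestamps (ms)

function slidingLogAllow(identity, limit = 100, windowMs = 60_000, now = Date.now()) {
  // Drop timestamps that have slid out of the window
  const log = (requestLogs.get(identity) || []).filter(t => t > now - windowMs);
  if (log.length >= limit) {
    requestLogs.set(identity, log);
    return false;                // limit reached within the sliding window
  }
  log.push(now);
  requestLogs.set(identity, log);
  return true;
}
```

The memory cost is visible here: the log grows with the request rate, which is exactly the con noted above.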
3. Token Bucket
Imagine a bucket that gets filled with tokens at a steady rate. Each request consumes a token. When the bucket is empty, requests are blocked.
Bucket capacity: 100 tokens
Refill rate: 10 tokens per minute
Request comes in: Remove 1 token
If bucket empty: Block request
Pros: Allows controlled bursts, smooth traffic flow
Cons: More complex to understand and implement
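One common implementation trick is to refill the bucket lazily, computing how many tokens accrued since the last request instead of running a timer. A sketch (function names and parameters are ours):

```javascript
// Token bucket with lazy refill: tokens accrue based on elapsed time
function makeTokenBucket(capacity = 100, refillPerSec = 10) {
  let tokens = capacity; // bucket starts full
  let last = 0;          // timestamp (ms) of the last refill calculation

  return function allow(now) {
    // Add tokens for the time elapsed since the last call, capped at capacity
    tokens = Math.min(capacity, tokens + ((now - last) / 1000) * refillPerSec);
    last = now;
    if (tokens >= 1) {
      tokens -= 1;       // consume one token for this request
      return true;
    }
    return false;        // bucket empty: block
  };
}
```

Because the bucket starts full, a burst up to `capacity` is allowed immediately; this is the "controlled bursts" property listed above.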
4. Leaky Bucket
Requests enter a queue (bucket) and are processed at a fixed rate, like water dripping from a leaky bucket.
Queue size: 50 requests
Processing rate: 5 requests per second
If queue full: Drop new requests
Pros: Smooths out traffic spikes
Cons: Can introduce latency for legitimate requests
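The admission decision can be sketched the same way (this simplified version tracks only the queue depth, not the queued work itself, and drains lazily based on elapsed time; names are ours):

```javascript
// Leaky bucket: requests queue up and drain at a fixed rate; a full queue drops new ones
function makeLeakyBucket(queueSize = 50, drainPerSec = 5) {
  let queued = 0; // number of requests currently waiting
  let last = 0;   // timestamp (ms) of the last drain calculation

  return function offer(now) {
    // Drain requests processed since the last call
    queued = Math.max(0, queued - ((now - last) / 1000) * drainPerSec);
    last = now;
    if (queued >= queueSize) return false; // queue full: drop the request
    queued += 1;
    return true;
  };
}
```

The contrast with the token bucket is that accepted requests still wait their turn in the queue, which smooths output at the cost of latency.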
Implementation Considerations
Where to Implement Rate Limiting
API Gateway: Centralized control, applies to all services
Application Level: More granular control, service-specific rules
Database/Cache: Protect backend resources directly
CDN/Proxy: Filter traffic before it reaches your infrastructure
Storage Options
In-Memory: Fast but doesn’t survive server restarts
Redis: Fast, persistent, and shared across servers, which makes it a common choice for distributed systems
Database: Reliable but slower, good for complex rules
Basic Implementation Example
Here’s a simple Node.js example using Redis for a fixed window counter:
const redis = require('redis');

const client = redis.createClient();

async function checkRateLimit(userId, limit = 100, windowSize = 60) {
  // node-redis v4+ requires an explicit connection before issuing commands
  if (!client.isOpen) await client.connect();

  const window = Math.floor(Date.now() / (windowSize * 1000));
  const windowKey = `rate_limit:${userId}:${window}`;

  // Atomically increment the counter for the current window
  const requests = await client.incr(windowKey);
  if (requests === 1) {
    // First request in this window: set an expiration so stale keys clean themselves up
    await client.expire(windowKey, windowSize);
  }
  return requests <= limit;
}

// Usage (inside an async request handler)
if (await checkRateLimit('user123')) {
  // Allow request
  processRequest();
} else {
  // Block request
  return res.status(429).json({ error: 'Rate limit exceeded' });
}
Best Practices for Beginners
Start Simple: Begin with fixed window counters before moving to complex algorithms
Choose Appropriate Limits: Monitor your normal traffic patterns and set limits accordingly
Provide Clear Responses: Include helpful headers like X-Rate-Limit-Remaining and Retry-After
Test Thoroughly: Verify your implementation handles edge cases and doesn't block legitimate users
Monitor and Adjust: Track blocked requests and adjust limits based on real usage patterns
Common HTTP Headers
X-Rate-Limit-Limit: 1000 # Total requests allowed
X-Rate-Limit-Remaining: 742 # Requests left in window
X-Rate-Limit-Reset: 1640995200 # When limit resets (timestamp)
Retry-After: 60 # Seconds to wait before retrying
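A small helper can assemble these headers from the counter state before they are attached to the response (the function name and argument shapes are ours for illustration):

```javascript
// Build rate-limit response headers from the current counter state
function rateLimitHeaders(limit, used, resetEpochSec, nowEpochSec) {
  const headers = {
    'X-Rate-Limit-Limit': String(limit),
    'X-Rate-Limit-Remaining': String(Math.max(0, limit - used)),
    'X-Rate-Limit-Reset': String(resetEpochSec),
  };
  if (used >= limit) {
    // Only blocked responses need to tell the client how long to back off
    headers['Retry-After'] = String(Math.max(0, resetEpochSec - nowEpochSec));
  }
  return headers;
}
```

Sending `Retry-After` only on 429 responses matches how clients typically use it: as the wait time before the next attempt.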
What's Next?
You now have the foundation to implement basic rate limiting! In Part 3, we'll explore advanced enterprise-level strategies including distributed rate limiting, dynamic limits, and sophisticated attack prevention techniques.
This is Part 2 of our 3-part series on API Gateway Rate Limiting. Check out Part 1 for the basics and stay tuned for Part 3 covering advanced enterprise strategies.