Now that you understand what rate limiting is and why it matters, let’s explore how it actually works. This post will give you the technical foundation to understand and implement basic rate limiting strategies.
How Rate Limiting Works
At its core, rate limiting involves three key components:
1. Identity: Who is making the request? (IP address, user ID, API key)
2. Counter: How many requests has this identity made?
3. Time Window: Within what time period are we counting?
The system checks each incoming request against these rules and either allows it through or blocks it.
Common Rate Limiting Algorithms
1. Fixed Window Counter
This is the simplest approach. Imagine dividing time into fixed chunks (like 1-minute windows) and counting requests in each chunk.
Window 1 (0:00-0:59): 50 requests ✅
Window 2 (1:00-1:59): 100 requests ✅
Window 3 (2:00-2:59): 150 requests ❌ (limit: 100)
Pros: Simple to implement, memory efficient
Cons: Can allow bursts at window boundaries; a client can send the full limit at the end of one window and again at the start of the next, roughly doubling the effective rate for a brief period
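To make this concrete, here is a minimal in-memory sketch of a fixed window counter in JavaScript (the map, function name, and `now` parameter are ours for illustration; a real deployment would keep the counters in a shared store such as Redis):

```javascript
// Fixed window counter: one counter per (identity, window number) pair
const fixedWindowCounters = new Map();

function fixedWindowAllow(identity, limit = 100, windowSizeMs = 60_000, now = Date.now()) {
  const window = Math.floor(now / windowSizeMs);   // which fixed window are we in?
  const key = `${identity}:${window}`;
  const count = (fixedWindowCounters.get(key) || 0) + 1;
  fixedWindowCounters.set(key, count);
  return count <= limit;                           // allow while under this window's limit
}
```

Note how the counter resets implicitly: a new window number produces a new key, so the count starts over from zero.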
2. Sliding Window Log
Instead of fixed windows, this approach tracks the exact timestamp of each request and slides the time window continuously.
Current time: 14:30:45
Look back 1 minute: Count requests from 14:29:45 to 14:30:45
The window slides forward continuously with the current time
Pros: Very accurate, no burst issues
Cons: Memory intensive (stores a timestamp for every request in the window)
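A sliding window log can be sketched like this in JavaScript (again an illustrative in-memory version with names of our choosing; the key idea is pruning timestamps that have fallen out of the window before counting):

```javascript
// Sliding window log: keep one timestamp per request, count only those in the window
const requestLogs = new Map(); // identity -> array of request timestamps (ms)

function slidingLogAllow(identity, limit = 100, windowMs = 60_000, now = Date.now()) {
  // Drop timestamps that have slid out of the window
  const log = (requestLogs.get(identity) || []).filter(t => t > now - windowMs);
  if (log.length >= limit) {
    requestLogs.set(identity, log);
    return false;                // limit reached within the sliding window
  }
  log.push(now);
  requestLogs.set(identity, log);
  return true;
}
```

The memory cost is visible here: the log grows with the request rate, which is exactly the con noted above.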
3. Token Bucket
Imagine a bucket that gets filled with tokens at a steady rate. Each request consumes a token. When the bucket is empty, requests are blocked.
Bucket capacity: 100 tokens
Refill rate: 10 tokens per minute
Request comes in: Remove 1 token
If bucket empty: Block request
Pros: Allows controlled bursts, smooth traffic flow
Cons: More complex to understand and implement
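One common implementation trick is to refill the bucket lazily, computing how many tokens accrued since the last request instead of running a timer. A sketch (function names and parameters are ours):

```javascript
// Token bucket with lazy refill: tokens accrue based on elapsed time
function makeTokenBucket(capacity = 100, refillPerSec = 10) {
  let tokens = capacity; // bucket starts full
  let last = 0;          // timestamp (ms) of the last refill calculation

  return function allow(now) {
    // Add tokens for the time elapsed since the last call, capped at capacity
    tokens = Math.min(capacity, tokens + ((now - last) / 1000) * refillPerSec);
    last = now;
    if (tokens >= 1) {
      tokens -= 1;       // consume one token for this request
      return true;
    }
    return false;        // bucket empty: block
  };
}
```

Because the bucket starts full, a burst up to `capacity` is allowed immediately; this is the "controlled bursts" property listed above.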
4. Leaky Bucket
Requests enter a queue (bucket) and are processed at a fixed rate, like water dripping from a leaky bucket.
Queue size: 50 requests
Processing rate: 5 requests per second
If queue full: Drop new requests
Pros: Smooths out traffic spikes
Cons: Can introduce latency for legitimate requests
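The admission decision can be sketched the same way (this simplified version tracks only the queue depth, not the queued work itself, and drains lazily based on elapsed time; names are ours):

```javascript
// Leaky bucket: requests queue up and drain at a fixed rate; a full queue drops new ones
function makeLeakyBucket(queueSize = 50, drainPerSec = 5) {
  let queued = 0; // number of requests currently waiting
  let last = 0;   // timestamp (ms) of the last drain calculation

  return function offer(now) {
    // Drain requests processed since the last call
    queued = Math.max(0, queued - ((now - last) / 1000) * drainPerSec);
    last = now;
    if (queued >= queueSize) return false; // queue full: drop the request
    queued += 1;
    return true;
  };
}
```

The contrast with the token bucket is that accepted requests still wait their turn in the queue, which smooths output at the cost of latency.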
Implementation Considerations
Where to Implement Rate Limiting
API Gateway: Centralized control, applies to all services
Application Level: More granular control, service-specific rules
Database/Cache: Protect backend resources directly
CDN/Proxy: Filter traffic before it reaches your infrastructure
Storage Options
In-Memory: Fast but doesn’t survive server restarts
Redis: Fast, persistent, and shared across servers, which makes it a common choice for distributed systems
Database: Reliable but slower, good for complex rules
Basic Implementation Example
Here’s a simple Node.js example using Redis for a fixed window counter:
const redis = require('redis');

const client = redis.createClient();

async function checkRateLimit(userId, limit = 100, windowSize = 60) {
  // node-redis v4+ requires an explicit connection before issuing commands
  if (!client.isOpen) await client.connect();

  const window = Math.floor(Date.now() / (windowSize * 1000));
  const windowKey = `rate_limit:${userId}:${window}`;

  // Atomically increment the counter for the current window
  const requests = await client.incr(windowKey);
  if (requests === 1) {
    // First request in this window: set an expiration so stale keys clean themselves up
    await client.expire(windowKey, windowSize);
  }
  return requests <= limit;
}

// Usage (inside an async request handler)
if (await checkRateLimit('user123')) {
  // Allow request
  processRequest();
} else {
  // Block request
  return res.status(429).json({ error: 'Rate limit exceeded' });
}
Best Practices for Beginners
Start Simple: Begin with fixed window counters before moving to complex algorithms
Choose Appropriate Limits: Monitor your normal traffic patterns and set limits accordingly
Provide Clear Responses: Include helpful headers like X-Rate-Limit-Remaining and Retry-After
Test Thoroughly: Verify your implementation handles edge cases and doesn't block legitimate users
Monitor and Adjust: Track blocked requests and adjust limits based on real usage patterns
Common HTTP Headers
X-Rate-Limit-Limit: 1000 # Total requests allowed
X-Rate-Limit-Remaining: 742 # Requests left in window
X-Rate-Limit-Reset: 1640995200 # When limit resets (timestamp)
Retry-After: 60 # Seconds to wait before retrying
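A small helper can assemble these headers from the counter state before they are attached to the response (the function name and argument shapes are ours for illustration):

```javascript
// Build rate-limit response headers from the current counter state
function rateLimitHeaders(limit, used, resetEpochSec, nowEpochSec) {
  const headers = {
    'X-Rate-Limit-Limit': String(limit),
    'X-Rate-Limit-Remaining': String(Math.max(0, limit - used)),
    'X-Rate-Limit-Reset': String(resetEpochSec),
  };
  if (used >= limit) {
    // Only blocked responses need to tell the client how long to back off
    headers['Retry-After'] = String(Math.max(0, resetEpochSec - nowEpochSec));
  }
  return headers;
}
```

Sending `Retry-After` only on 429 responses matches how clients typically use it: as the wait time before the next attempt.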
What's Next?
You now have the foundation to implement basic rate limiting! In Part 3, we'll explore advanced enterprise-level strategies including distributed rate limiting, dynamic limits, and sophisticated attack prevention techniques.
This is Part 2 of our 3-part series on API Gateway Rate Limiting. Check out Part 1 for the basics and stay tuned for Part 3 covering advanced enterprise strategies.