
Rate Limiting - Interactive Guide

Master rate limiting with an interactive Token Bucket simulator. Understand the algorithm, see it in action, and implement it in code.

Protect your APIs from abuse with rate limiting. This guide covers the Token Bucket algorithm, the approach used in production by AWS, Stripe, and GitHub.

How to use: Toggle between Visual to understand the algorithm and Code to implement it. Try the simulator to see rate limiting in action.

The Problem

One bad actor (or buggy client) sends 10,000 requests/second and crashes your server. Your legitimate users can't access the service. You need a way to enforce fair usage limits.

Token Bucket Simulator

[Interactive simulator: incoming requests draw from a 10-token bucket that refills at 1 token/sec; counters track how many requests are allowed vs. rejected.]

How It Works

  • A request arrives; check the token bucket.
  • Tokens available? Yes: ✓ process the request and consume 1 token.
  • Tokens available? No: ✗ return 429 Too Many Requests.
Key Insight: The bucket allows short bursts (up to capacity) while enforcing a long-term average rate. A 10-token bucket with 1 token/sec refill allows 10 quick requests, then 1 request/second after.
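The key insight above can be checked with quick arithmetic, using the simulator's default values (capacity 10, refill 1 token/sec):

```javascript
// Burst vs. steady-state throughput for a token bucket.
const capacity = 10;   // bucket size
const refillRate = 1;  // tokens per second

// A full bucket serves up to `capacity` requests immediately (the burst).
const burst = capacity; // 10 requests at once

// After the bucket drains, throughput is bounded by the refill rate.
const steady = 60 * refillRate; // 60 requests over the next minute

// Total served in the first minute: 10 (burst) + 60 (refill) = 70.
const totalFirstMinute = burst + steady;
```

So the long-term average converges to the refill rate, while the capacity only controls how large a burst is tolerated.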

Token Bucket Implementation

```js
class TokenBucket {
  constructor(capacity, refillRate) {
    this.tokens = capacity;       // Start full
    this.capacity = capacity;     // Max tokens (e.g., 10)
    this.refillRate = refillRate; // Tokens added per second
    this.lastRefill = Date.now();
  }

  tryConsume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // Request allowed
    }
    return false;   // Rate limited (429)
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}

// Usage in Express middleware.
// Note: this single bucket limits ALL clients collectively;
// in practice you'd keep one bucket per client (e.g., per IP or API key).
const limiter = new TokenBucket(10, 1); // 10 tokens, 1/sec refill

app.use((req, res, next) => {
  if (!limiter.tryConsume()) {
    return res.status(429).json({
      error: 'Too many requests',
      retryAfter: 1
    });
  }
  next();
});
```
Pitfall: This only works for a single server. In distributed systems, use Redis to share state across servers.
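A related single-server pitfall: one shared bucket throttles all clients together, so one noisy client starves everyone else. A minimal sketch of per-client buckets, keyed by an arbitrary client identifier (the `allow` helper and its defaults are illustrative, not part of any library):

```javascript
// Compact token bucket (same algorithm as the class above, repeated
// here so this sketch is self-contained).
class TokenBucket {
  constructor(capacity, refillRate) {
    this.tokens = capacity;
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.lastRefill = Date.now();
  }
  tryConsume() {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.lastRefill) / 1000) * this.refillRate
    );
    this.lastRefill = now;
    if (this.tokens >= 1) { this.tokens -= 1; return true; }
    return false;
  }
}

const buckets = new Map(); // clientKey -> TokenBucket

// Hypothetical helper: lazily create one bucket per client.
function allow(clientKey, capacity = 10, refillRate = 1) {
  if (!buckets.has(clientKey)) {
    buckets.set(clientKey, new TokenBucket(capacity, refillRate));
  }
  return buckets.get(clientKey).tryConsume();
}
```

In Express you might call `allow(req.ip)` in middleware. In production you'd also evict idle buckets (e.g., an LRU map) so memory stays bounded.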

Distributed Rate Limiting (Redis)

```js
// Fixed-window counter: allows `limit` requests per `window` seconds.
// (Note: this is a simpler algorithm than the token bucket above.)
async function isRateLimited(userId) {
  const key = `ratelimit:${userId}`;
  const limit = 100;
  const window = 60; // seconds

  const current = await redis.incr(key);
  if (current === 1) {
    // First request in this window: start the expiry clock.
    // INCR + EXPIRE in two steps isn't atomic; a Lua script or
    // SET ... NX EX can close that gap.
    await redis.expire(key, window);
  }
  return current > limit; // true = blocked
}
```
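To see the counter logic work without running a Redis server, here is the same function against a tiny in-memory stub. The stub is purely illustrative (it mimics only INCR and EXPIRE); in production you would use a real client such as ioredis:

```javascript
const store = new Map(); // key -> { count, expiresAt }

// Minimal stand-in for the two Redis commands the limiter uses.
const redis = {
  async incr(key) {
    const now = Date.now();
    const entry = store.get(key);
    if (!entry || entry.expiresAt <= now) {
      // New key (or expired window): reset the counter.
      store.set(key, { count: 1, expiresAt: Infinity });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  },
  async expire(key, seconds) {
    const entry = store.get(key);
    if (entry) entry.expiresAt = Date.now() + seconds * 1000;
  },
};

// Same shape as the Redis example, with limit/window as parameters.
async function isRateLimited(userId, limit = 100, windowSec = 60) {
  const key = `ratelimit:${userId}`;
  const current = await redis.incr(key);
  if (current === 1) await redis.expire(key, windowSec);
  return current > limit; // true = blocked
}
```

One caveat worth knowing: fixed windows reset abruptly, so a client can burst up to 2× the limit across a window boundary; token bucket and sliding-window algorithms smooth this out.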
Production Tips:
  • Always return a Retry-After header with 429 responses
  • Use different limits for authenticated vs anonymous users
  • Consider libraries: express-rate-limit, rate-limiter-flexible

When to Use

  • Public APIs (prevent abuse)
  • Login endpoints (prevent brute force)
  • Expensive operations (AI, uploads)

When to Skip

  • Internal service calls (use circuit breaker)
  • Health check endpoints

Related Patterns

Rate limiting pairs well with Circuit Breaker (for internal services) and API Gateway (for centralized enforcement). Coming soon as separate interactive guides.
