Glossary

Rate Limiting

Rate limiting controls how many requests a client can make to an API within a given time window, protecting against abuse, DoS attacks, and resource exhaustion.

Explanation

Without rate limiting, a single buggy script, malicious attacker, or runaway loop can overwhelm your API, degrading service for everyone. Rate limiting enforces a maximum request rate per client, identified by API key, user ID, or IP address.

Common algorithms:

- Fixed window: count requests in fixed time buckets. Simple, but allows bursts at window boundaries (a client can send up to twice the limit by straddling two windows).
- Sliding window: count requests in a rolling time window. Smoother, and prevents boundary bursts.
- Token bucket: each client earns N tokens per second; each request consumes one token, and unused tokens accumulate up to a maximum. This allows controlled bursts while enforcing an average rate, and is the most widely used approach for APIs.
- Leaky bucket: process requests at a fixed rate, queuing the excess. Smooths traffic.

When a client exceeds its limit, return HTTP 429 Too Many Requests with a Retry-After header indicating when the client can retry. Also include X-RateLimit-Limit (total requests allowed), X-RateLimit-Remaining (requests left in the current window), and X-RateLimit-Reset (when the window resets).

Distributed rate limiting (across multiple servers) requires a shared store; Redis is the standard choice. A per-server limiter lets each client make N requests per window on every server, so a deployment of S servers effectively allows N × S requests per window, defeating the purpose.
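To make the token bucket concrete, here is a minimal in-memory sketch (illustrative only, not a library API; the class name and parameters are invented for this example). A burst of up to `capacity` requests passes immediately, after which sustained traffic is capped at `refillPerSec` requests per second on average.

```javascript
// Minimal token-bucket rate limiter (illustrative sketch).
// Each client gets `capacity` tokens; tokens refill at `refillPerSec`.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full: allows an initial burst
    this.last = now;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  allow(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    // Refill tokens earned since the last request, capped at capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// 5-token burst, 1 request/sec sustained average:
const bucket = new TokenBucket(5, 1);
```

A real deployment would keep one bucket per client key (and, as noted above, in a shared store rather than in process memory).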

Code Example

```javascript
// Distributed rate limiting with express-rate-limit + rate-limit-redis
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis');
const { createClient } = require('redis');

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect(); // top-level await requires an ES module; otherwise wrap in an async function

// Global: 100 requests per 15 minutes per IP
const globalLimit = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true, // sets the draft-standard RateLimit-* headers
  legacyHeaders: true,   // sets the X-RateLimit-* headers
  store: new RedisStore({
    sendCommand: (...args) => redis.sendCommand(args),
  }),
  handler: (req, res) => {
    res.status(429).json({
      error: 'Too many requests',
      retryAfter: res.getHeader('Retry-After'),
    });
  },
});

// Stricter for auth endpoints
const authLimit = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 10, // 10 login attempts per 15 min
  store: new RedisStore({
    sendCommand: (...args) => redis.sendCommand(args),
    prefix: 'auth:',
  }),
});

app.use('/api/', globalLimit);
app.use('/api/auth/', authLimit);
```

Why It Matters for Engineers

Rate limiting is a non-negotiable requirement for any public API. Without it, a single client can consume all your server resources. It's also a security control: rate limiting login endpoints prevents credential stuffing, and rate limiting password-reset endpoints prevents email flooding. 'Design a rate limiter' is a classic system design interview question that tests knowledge of distributed counters, Redis, and algorithm trade-offs. Understanding why distributed rate limiting requires a shared store demonstrates practical systems thinking.
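Well-behaved clients should honor the 429 response rather than hammer the API. One wrinkle: per RFC 9110, Retry-After may be either delay-seconds ("120") or an HTTP-date, so a robust client handles both. A small sketch (the function name is invented for this example):

```javascript
// Compute how many milliseconds a client should wait after a 429,
// given the Retry-After header. Returns null if the header is absent
// or unparseable (callers would then fall back to their own backoff).
function retryAfterMs(headerValue, now = Date.now()) {
  if (headerValue == null) return null;
  const seconds = Number(headerValue);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000); // delay-seconds form
  const date = Date.parse(headerValue); // HTTP-date form
  return Number.isNaN(date) ? null : Math.max(0, date - now);
}
```

A client would call this on a 429 response, sleep for the returned duration, and then retry, ideally with a cap on total attempts.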

Related Terms

API · Idempotency · Cache · Middleware

Learn This In Practice

Go deeper with the full module on Beyond Vibe Code.

Systems Design Fundamentals →