
Production-Ready Rate Limiting System: Redis and Node.js Implementation Guide with Token Bucket Algorithm

Learn to build a production-ready rate limiting system with Redis and Node.js. Master token bucket, sliding window algorithms, and distributed rate limiting.


I was building a web service that suddenly started getting hammered by unexpected traffic. Our servers were struggling, and I realized we needed a solid way to control how many requests each user could make. That’s when I dove into creating a production-ready rate limiting system with Redis and Node.js. If you’ve ever worried about your API being abused or your resources drained, this is for you. Let’s build something robust together.

Rate limiting is about controlling how often someone can do something in a set time. Think of it like a bouncer at a club, only letting in a certain number of people per hour. It stops bad actors from flooding your system and keeps things fair for everyone. Why does this matter? Without it, a single user could make thousands of requests and crash your app, or you might blow your budget on cloud costs.

Have you ever noticed how some apps let you try a password only a few times before locking you out? That’s rate limiting in action. It’s not just for security; it helps manage resources so that one heavy user doesn’t ruin the experience for others.

To get started, you’ll need Node.js and Redis installed. I use Redis because it’s fast and handles data across multiple servers easily. Here’s a quick setup for a new project:

npm init -y
npm install express redis ioredis

Now, let’s look at the core algorithms. The token bucket method allows bursts of activity. Imagine you have a bucket that holds 10 tokens and refills one token every second. If you spend 5 tokens at once, that burst is allowed, but then you have to wait for the bucket to refill. Here’s a simple implementation:

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;     // maximum tokens the bucket holds
    this.refillRate = refillRate; // tokens added per second
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  take() {
    this.refill();
    // Tokens accrue fractionally, so require a whole token before spending one
    if (this.tokens >= 1) {
      this.tokens--;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const secondsPassed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + secondsPassed * this.refillRate);
    this.lastRefill = now;
  }
}

But what if you need something smoother? The sliding window algorithm tracks requests in real-time, so it’s fairer than fixed windows that reset abruptly. For example, if you allow 100 requests per minute, it counts requests in the last 60 seconds continuously.
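To make the idea concrete before we bring Redis in, here’s a minimal single-process sketch of a sliding window log (the class and method names are my own; it keeps a timestamp per request and counts only the ones inside the window):

```javascript
// Minimal in-memory sliding window log. Single process only --
// the Redis version is what you'd use across many servers.
class SlidingWindowLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.timestamps = [];
  }

  allow(now = Date.now()) {
    const windowStart = now - this.windowMs;
    // Discard timestamps that have slid out of the window
    this.timestamps = this.timestamps.filter(t => t > windowStart);
    if (this.timestamps.length < this.maxRequests) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```

Because old timestamps expire continuously, a user who fired 100 requests at 11:59:59 can’t fire 100 more at 12:00:01 the way a fixed one-minute window would let them.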

How do you handle this across many servers? Redis is perfect because it stores data in one place all servers can access. Here’s a basic Redis-based rate limiter using the sliding window approach:

const Redis = require('ioredis'); // ioredis gives promise-based commands out of the box
const client = new Redis();

async function checkRateLimit(key, maxRequests, windowMs) {
  const now = Date.now();
  const windowStart = now - windowMs;

  // Drop request timestamps that have slid out of the window
  await client.zremrangebyscore(key, 0, windowStart);
  const requestCount = await client.zcard(key);

  if (requestCount < maxRequests) {
    // Unique member so two requests in the same millisecond both count
    await client.zadd(key, now, `${now}:${Math.random()}`);
    await client.expire(key, Math.ceil(windowMs / 1000));
    return { allowed: true, remaining: maxRequests - requestCount - 1 };
  }

  // The oldest entry tells us when the next slot frees up.
  // Note: this check-then-add flow isn't atomic; wrap it in a MULTI or a
  // Lua script if you need exact limits under heavy concurrency.
  const [, oldestScore] = await client.zrange(key, 0, 0, 'WITHSCORES');
  return {
    allowed: false,
    retryAfter: Math.ceil((Number(oldestScore) + windowMs - now) / 1000)
  };
}

I integrated this into Express middleware to protect routes easily. It checks each request and blocks if the limit is hit. You can customize it per user, IP, or any other identifier.
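Here’s roughly what that middleware looks like. The `rateLimitMiddleware` factory and the fail-open choice are my own sketch; it assumes a `checkRateLimit` function shaped like the one above (any async function returning `{ allowed, remaining, retryAfter }` works):

```javascript
// Express-style middleware factory. `checkRateLimit` is injected so the
// middleware itself stays free of Redis details and easy to test.
function rateLimitMiddleware(checkRateLimit, { maxRequests = 100, windowMs = 60000 } = {}) {
  return async (req, res, next) => {
    // Keyed by IP here; swap in a user id or API key as needed
    const key = `rl:${req.ip}`;
    try {
      const result = await checkRateLimit(key, maxRequests, windowMs);
      res.set('X-RateLimit-Remaining', String(result.remaining ?? 0));
      if (!result.allowed) {
        res.set('Retry-After', String(result.retryAfter ?? 1));
        return res.status(429).json({ error: 'Too many requests' });
      }
      next();
    } catch (err) {
      // Fail open if Redis is unreachable so the API stays available
      next();
    }
  };
}
```

Wiring it up is then one line per route, e.g. `app.use('/api', rateLimitMiddleware(checkRateLimit, { maxRequests: 100, windowMs: 60000 }))`.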

What about monitoring? I added metrics to track how often limits are hit, using tools like Prometheus. This helps you see if your limits are too strict or too loose. For instance, logging when a user gets rate limited can alert you to potential issues.
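As a sketch of what those metrics can look like, here’s a tiny dependency-free counter that emits Prometheus text format (in a real setup you’d likely use the prom-client package instead; the metric name and label here are my own assumptions):

```javascript
// In-process counter of rate-limit hits, exposable on a /metrics endpoint.
class RateLimitMetrics {
  constructor() {
    this.hits = new Map(); // route -> number of 429s served
  }

  recordLimited(route) {
    this.hits.set(route, (this.hits.get(route) || 0) + 1);
  }

  // Render in Prometheus text exposition format
  toPrometheus() {
    let out = '# TYPE rate_limited_total counter\n';
    for (const [route, n] of this.hits) {
      out += `rate_limited_total{route="${route}"} ${n}\n`;
    }
    return out;
  }
}
```

A spike in `rate_limited_total` on one route is exactly the signal that tells you whether you’re blocking an attack or frustrating real users.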

Testing is crucial. I write unit tests for the algorithms and integration tests with real Redis instances. Simulate high traffic to ensure it holds up under pressure.
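Timing-based code is much easier to test deterministically if you control the clock. This sketch stubs Date.now so refill timing is exact (the TokenBucket class from earlier is inlined so the snippet runs on its own):

```javascript
// TokenBucket from earlier, inlined so this test snippet is self-contained
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.refillRate = refillRate; // tokens per second
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }
  take() {
    this.refill();
    if (this.tokens >= 1) { this.tokens--; return true; }
    return false;
  }
  refill() {
    const now = Date.now();
    const secondsPassed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + secondsPassed * this.refillRate);
    this.lastRefill = now;
  }
}

// Stub the clock so refills happen exactly when we say they do
const realNow = Date.now;
let fakeTime = 0;
Date.now = () => fakeTime;

const bucket = new TokenBucket(2, 1); // 2 tokens, refills 1 token/second
if (bucket.take() !== true) throw new Error('first take should pass');
if (bucket.take() !== true) throw new Error('burst of 2 should pass');
if (bucket.take() !== false) throw new Error('bucket should be empty');

fakeTime += 1000; // advance one second -> exactly one token refilled
if (bucket.take() !== true) throw new Error('refill should allow one more');
if (bucket.take() !== false) throw new Error('only one token was refilled');

Date.now = realNow; // always restore the real clock
```

The same trick works for the sliding window: pass timestamps in explicitly instead of reading the clock inside the limiter.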

In production, set sensible defaults based on your app’s needs. Start conservative and adjust as you monitor usage. Remember, rate limiting should protect without frustrating legitimate users.
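For example, a starting point might look like this (the tiers and numbers are illustrative assumptions, not recommendations derived from any benchmark):

```javascript
// Illustrative starting limits; tune these against your own traffic data
const RATE_LIMITS = {
  anonymous:     { maxRequests: 60,  windowMs: 60_000 },      // ~1 req/s sustained
  authenticated: { maxRequests: 600, windowMs: 60_000 },      // more headroom for signed-in users
  login:         { maxRequests: 5,   windowMs: 15 * 60_000 }, // slow down brute-force attempts
};
```

Keeping the limits in one config object like this makes them easy to review and adjust without touching the limiter code.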

I hope this guide helps you build a system that keeps your app safe and responsive. If you found this useful, please like, share, and comment with your experiences or questions. Let’s learn from each other!



