
Mastering API Rate Limiting with Redis: Fixed, Sliding, and Token Bucket Strategies

Learn how to implement scalable API rate limiting using Redis with fixed window, sliding window, and token bucket algorithms.

I was building an API for a client last week when I noticed something strange in the logs. A single user was making thousands of requests per minute to an endpoint that should have been called maybe once an hour. It wasn’t malicious, just a bug in their code, but it was enough to slow down the service for everyone else. That moment made me realize how fragile our systems can be without proper controls. It’s not just about stopping bad actors; it’s about creating a fair, stable environment for all users. If you’re building APIs that others depend on, this is a skill you need. Let’s talk about how to do it right.

Think of rate limiting as a traffic light for your API. It tells requests when to go, when to slow down, and when to stop. Without it, you risk everything from server crashes to huge cloud bills. But how do you choose the right method? The answer depends on what you’re protecting.

The simplest approach is the fixed window. Imagine a counter that resets every hour. If you allow 100 requests per hour, the 101st request in that hour gets blocked. It’s easy to understand and implement. But it has a flaw. What if 100 requests come in at 1:59 PM, and another 100 come in at 2:00 PM? You’ve just had 200 requests in one minute, which might still overload your system. This is called the “boundary problem.”
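Before bringing Redis into the picture, the algorithm itself is tiny. Here is an in-memory sketch (single process only; the class name is mine, and the clock is injected so the reset behavior is easy to verify):

```javascript
// In-memory sketch of a fixed window counter. Every timestamp maps to a
// window bucket; when the bucket changes, the counter resets -- which is
// exactly what creates the "boundary problem" described above.
class FixedWindowCounter {
  constructor(maxRequests, windowMs, nowFn = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.nowFn = nowFn; // injectable clock for testing
    this.windowStart = 0; // identifier of the current window bucket
    this.count = 0;
  }

  allow() {
    const window = Math.floor(this.nowFn() / this.windowMs);
    if (window !== this.windowStart) {
      // A new window has started: forget everything that came before
      this.windowStart = window;
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count += 1;
      return true;
    }
    return false;
  }
}
```

Note how a burst just before and just after the window boundary is invisible to this counter; that blind spot is what the next two algorithms address.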

So, we need something smoother. This is where the sliding window comes in. Instead of a fixed block of time, we look at a moving window. If your limit is 100 per hour, we only count requests from the past 60 minutes at any given moment. This gives you a much more accurate picture of real-time traffic. It’s more complex but far more effective for production systems.
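The idea is easiest to see in memory first: keep one timestamp per request, drop the ones older than the window, and count what is left. The Redis version later in this article does the same thing with a sorted set so every server sees one shared log. This sketch (my own naming, clock injected for testability) is the single-process version:

```javascript
// In-memory sketch of a sliding window log. Each allowed request leaves a
// timestamp; only timestamps inside the moving window count toward the limit.
class SlidingWindowLog {
  constructor(maxRequests, windowMs, nowFn = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.nowFn = nowFn; // injectable clock for testing
    this.timestamps = [];
  }

  allow() {
    const now = this.nowFn();
    // Evict entries that have fallen out of the moving window
    this.timestamps = this.timestamps.filter((t) => t > now - this.windowMs);
    if (this.timestamps.length < this.maxRequests) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```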

Then there’s the token bucket. Picture a bucket that holds tokens. The bucket refills at a steady rate. Each API request costs one token. If the bucket is empty, the request is denied. The clever part is that the bucket can have a capacity larger than the refill rate. This allows for short bursts of traffic, which is perfect for user-facing applications where activity isn’t always perfectly smooth.

For this to work in a real Node.js application with multiple servers, we need a shared state. This is where Redis shines. It’s fast, it’s in-memory, and every server in your cluster can talk to the same Redis instance. This gives us a single source of truth for counting requests. Let’s set it up.

First, we connect to Redis. A client like ioredis is a good choice: it multiplexes commands over a single connection and reconnects automatically, so we don’t waste resources managing connections ourselves.

// redisClient.js
const Redis = require('ioredis');

class RedisClient {
  constructor() {
    this.client = new Redis({
      host: process.env.REDIS_HOST || '127.0.0.1',
      port: Number(process.env.REDIS_PORT) || 6379,
      // Back off on reconnect attempts: 50ms, 100ms, ... capped at 2s
      retryStrategy(times) {
        return Math.min(times * 50, 2000);
      }
    });

    this.client.on('error', (err) => {
      console.error('Redis error:', err);
    });
  }

  getClient() {
    return this.client;
  }
}

module.exports = new RedisClient().getClient();

Now, let’s build a sliding window rate limiter as Express middleware. The key is to use a Redis sorted set. We’ll use timestamps as scores. This lets us easily remove old entries and count how many are left in our time window.

// slidingWindowLimiter.js
const redisClient = require('./redisClient');

async function slidingWindowLimiter(req, res, next) {
  const userId = req.user?.id || req.ip; // Fall back to IP if no user is logged in
  const key = `rate_limit:${userId}:${req.path}`;
  const now = Date.now();
  const windowMs = 60 * 60 * 1000; // 1 hour window
  const maxRequests = 100;
  // Unique member for the sorted set: two requests arriving in the same
  // millisecond would otherwise collide on ZADD and be undercounted
  const member = `${now}-${Math.random().toString(36).slice(2)}`;

  // Lua script so the cleanup, count, and insert happen atomically in Redis
  const luaScript = `
    local key = KEYS[1]
    local now = tonumber(ARGV[1])
    local window = tonumber(ARGV[2])
    local max = tonumber(ARGV[3])
    local member = ARGV[4]

    -- Remove timestamps older than the window
    redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

    -- Count current requests
    local current = redis.call('ZCARD', key)

    if current < max then
      -- Allow the request and record its timestamp
      redis.call('ZADD', key, now, member)
      redis.call('PEXPIRE', key, window)
      return {1, max - current - 1} -- allowed, remaining
    else
      -- Deny the request
      local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
      local resetTime = tonumber(oldest[2]) + window
      return {0, 0, resetTime} -- denied, remaining, reset timestamp
    end
  `;

  try {
    const result = await redisClient.eval(
      luaScript,
      1, // number of keys
      key,
      now.toString(),
      windowMs.toString(),
      maxRequests.toString(),
      member
    );

    const [allowed, remaining, resetTime] = result;

    // Set helpful headers for the API consumer
    res.setHeader('X-RateLimit-Limit', maxRequests);
    res.setHeader('X-RateLimit-Remaining', remaining);
    if (resetTime) {
      res.setHeader('X-RateLimit-Reset', new Date(resetTime).toISOString());
    }

    if (allowed === 0) {
      return res.status(429).json({
        error: 'Too Many Requests',
        message: `Rate limit exceeded. Try again after ${new Date(resetTime).toISOString()}`,
        retryAfter: Math.ceil((resetTime - now) / 1000)
      });
    }

    next(); // Request is allowed, proceed
  } catch (error) {
    console.error('Rate limiter error:', error);
    // If Redis fails, let the request through. This is a "fail open" strategy.
    // You might choose "fail closed" for stricter security.
    next();
  }
}

module.exports = slidingWindowLimiter;

But what about different user tiers? A free user might get 100 requests per hour, while a premium user gets 10,000. We need a flexible system. We can store user tiers in a database and fetch the limit configuration dynamically.

// tieredLimiter.js
async function tieredLimiter(req, res, next) {
  const userId = req.user?.id || req.ip;

  // Fetch the user's plan from your database. UserPlan stands in for whatever
  // model your application uses; cache this lookup so you aren't hitting the
  // database on every request.
  const userPlan = await UserPlan.findOne({ userId });
  const limitConfig = getLimitConfig(userPlan?.tier); // e.g., 'free', 'pro', 'enterprise'

  // Now use limitConfig.windowMs and limitConfig.maxRequests
  // in the sliding window logic from above...
  // ... (sliding window logic here)
}
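One possible shape for getLimitConfig is a static map keyed by tier name. The tier names and numbers here are illustrative assumptions, not from any real billing system; the important design choice is the fallback, so an unknown or missing tier never disables limiting entirely:

```javascript
// Illustrative tier configuration: unknown tiers fall back to the most
// restrictive plan rather than getting unlimited access.
const TIER_LIMITS = {
  free: { windowMs: 60 * 60 * 1000, maxRequests: 100 },
  pro: { windowMs: 60 * 60 * 1000, maxRequests: 10000 },
  enterprise: { windowMs: 60 * 60 * 1000, maxRequests: 100000 },
};

function getLimitConfig(tier) {
  // Fail toward the strictest limit, never toward no limit at all
  return TIER_LIMITS[tier] || TIER_LIMITS.free;
}
```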

Handling bursts gracefully is another challenge. The token bucket algorithm is excellent for this. We can implement it by storing a token count and a last-updated timestamp in Redis.

// tokenBucketLimiter.js
const redisClient = require('./redisClient');

async function tokenBucketCheck(userId, bucketCapacity, refillRate) {
  const key = `token_bucket:${userId}`;
  const now = Date.now();
  
  const luaScript = `
    local key = KEYS[1]
    local now = tonumber(ARGV[1])
    local capacity = tonumber(ARGV[2])
    local refillPerMs = tonumber(ARGV[3]) / 1000 -- refill rate per millisecond
    
    local bucket = redis.call('HMGET', key, 'tokens', 'lastRefill')
    local tokens = tonumber(bucket[1]) or capacity
    local lastRefill = tonumber(bucket[2]) or now
    
    -- Calculate how many tokens have been added since last check
    local timePassed = now - lastRefill
    local refillAmount = math.floor(timePassed * refillPerMs)
    tokens = math.min(capacity, tokens + refillAmount)
    
    if tokens >= 1 then
      -- Consume a token
      tokens = tokens - 1
      redis.call('HSET', key, 'tokens', tokens, 'lastRefill', now)
      redis.call('PEXPIRE', key, 3600000) -- expire idle buckets after 1 hour
      return {1, tokens} -- allowed, remaining tokens
    else
      return {0, 0} -- denied, no tokens left
    end
  `;
  
  const result = await redisClient.eval(luaScript, 1, key, now, bucketCapacity, refillRate);
  return { allowed: result[0] === 1, remainingTokens: result[1] };
}
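The refill arithmetic inside the script is worth sanity-checking on its own. Here it is mirrored in plain JavaScript (the function name is mine), assuming, as the script does, that refillRate is tokens per second:

```javascript
// Mirror of the Lua refill math: how many tokens are in the bucket after
// `now - lastRefill` milliseconds of refilling, never exceeding capacity.
function refillTokens(tokens, lastRefill, now, capacity, refillRate) {
  const refillPerMs = refillRate / 1000; // tokens/second -> tokens/millisecond
  const refillAmount = Math.floor((now - lastRefill) * refillPerMs);
  return Math.min(capacity, tokens + refillAmount);
}
```

For example, with a capacity of 20 and a refill rate of 5 tokens per second, a bucket holding 3 tokens refills to 13 after two idle seconds, and a nearly full bucket simply caps at 20.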

What happens when a user hits their limit? Simply returning a 429 error is fine, but we can do better. Consider implementing a queue for certain critical operations, or returning a partial response with a warning. Communication is key. Always use standard headers like X-RateLimit-Remaining and Retry-After so clients can build smart retry logic.
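On the client side, that retry logic can be as simple as honoring Retry-After when the server sends it and falling back to capped exponential backoff when it doesn't. A minimal sketch (function name and default numbers are my own choices):

```javascript
// Decide how long a client should wait before retrying after a 429.
// Honors a numeric Retry-After header (in seconds) when present; otherwise
// falls back to exponential backoff with a cap.
function retryDelayMs(retryAfterHeader, attempt, baseMs = 500, capMs = 30000) {
  const retryAfter = Number(retryAfterHeader);
  if (Number.isFinite(retryAfter) && retryAfter > 0) {
    return retryAfter * 1000; // the server told us exactly when to come back
  }
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Adding a little random jitter to the fallback delay is also a good idea in practice, so a fleet of blocked clients doesn't retry in lockstep.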

Monitoring is the final, crucial piece. You need to know when limits are being hit and by whom. Log these events and consider setting up alerts for unusual spikes. This data can also help you adjust your limits to better match real usage patterns.

// Simple logging in the limiter. monitoringService stands in for whatever
// metrics client you use (e.g., a Datadog or StatsD client).
if (!allowed) {
  console.warn(`Rate limit hit for user ${userId} on path ${req.path}`);
  monitoringService.increment('rate_limit.hits', 1, { userId, path: req.path });
}

Building this changed how I see API design. It’s not just about making endpoints available; it’s about managing how they are used. It’s a commitment to reliability and fairness for every person or service that calls your code. Start with a simple fixed window, then move to a sliding window as your needs grow. Use Redis to make it work across your entire infrastructure. Remember, the goal isn’t to say “no” to requests, but to say “yes, in a way that works for everyone.”

Did you find this guide helpful? Have you encountered a rate limiting challenge that required a unique solution? Share your thoughts in the comments below—I’d love to hear about your experiences. If this article helped you build a more robust API, please consider liking and sharing it with other developers in your network.


