How to Build a Reliable Data Pipeline with Kafka, Redis, and Prisma

Learn how to build a reliable data pipeline with Kafka, Redis, and Prisma for scalable event processing, deduplication, and storage.

Ever tried to push water uphill with a sieve? That’s what building a data pipeline can feel like. Events pour in from everywhere—clicks, logs, transactions—and you need to sort, clean, and store them across different systems without losing a drop. I spent last week watching this exact struggle play out in a team’s chat, sparking the idea for this walkthrough. Let’s build something better, together.

Think about the last time you bought something online. That single click likely created a cascade of events: updating inventory, charging a payment method, sending a confirmation. Now imagine handling thousands of those clicks every second. Where do you even start?

I start with a simple truth: order creates control. We need a clear path for our data to travel. This path begins with a robust entry point, a place where all those random events can line up and wait their turn. For this, I use Apache Kafka. It’s less of a database and more of a highly organized, fault-tolerant log. Events go in one end and wait patiently to be processed. But how do we get them in there?

First, you need a producer. This is the code that takes an event from your application and publishes it to a Kafka “topic,” which is just a fancy name for a categorized log stream. Here’s a stripped-down version of what that looks like.

import { Kafka } from 'kafkajs';
import { randomUUID } from 'node:crypto';

const kafka = new Kafka({
  clientId: 'my-app-producer',
  brokers: ['localhost:9092']
});

const producer = kafka.producer();
await producer.connect();

const event = {
  id: randomUUID(), // unique event ID, used downstream for deduplication
  userId: 'user_12345',
  action: 'purchase',
  itemId: 'item_987',
  timestamp: new Date().toISOString()
};

await producer.send({
  topic: 'raw-user-events',
  messages: [
    { value: JSON.stringify(event) }
  ],
});

This code takes a simple JavaScript object, turns it into a JSON string, and sends it to a topic called raw-user-events. It’s now in the pipeline. But what good is data just sitting in a log? We need to do something with it.

That’s where the consumer comes in. It subscribes to a topic and processes each message. But here’s the first big question: what happens if your consumer crashes while processing, and then restarts? How do you avoid handling the same event twice? This problem, called duplicate processing, can cause huge issues, like charging a customer multiple times.

The solution is idempotency. It means making your operation safe to repeat. A common way to achieve this is by using a fast cache, like Redis, to check if you’ve seen an event before. Every event gets a unique ID. Before processing, you ask Redis: “Have I done this one?” If yes, you skip it.

import Redis from 'ioredis';
const redis = new Redis();

async function processEvent(message) {
  const event = JSON.parse(message.value);
  const eventId = event.id; // A unique UUID from the producer

  // Check for duplicate
  const isDuplicate = await redis.get(`event:${eventId}`);
  if (isDuplicate) {
    console.log(`Skipping duplicate event: ${eventId}`);
    return; // Do nothing
  }

  // Process the new event here...
  await saveToDatabase(event);

  // Mark it as processed, expire the key in 24 hours
  await redis.setex(`event:${eventId}`, 86400, 'processed');
}

This simple check acts as a gatekeeper, preventing chaos. Now our data is flowing in and being processed reliably. But we’re just writing it to “a database.” The original challenge mentioned multiple databases. Why would you do that?

Different data has different jobs. Some information, like a user’s final order total, needs strict correctness and fits neatly into rows and columns—a job for a relational database like PostgreSQL. Other data, like the complete, raw event with all its nested details, is better stored as a flexible document in something like MongoDB. The trick is using one tool to talk to both.

This is where Prisma shines. You define your data structures declaratively in a schema file and query them the same way regardless of the engine underneath. One caveat: a Prisma schema supports exactly one datasource, so talking to both PostgreSQL and MongoDB means keeping two small schema files and generating a separate client from each. The queries still look identical, and Prisma brings type-safety, meaning your code editor will scream at you if you try to save a string where a number should go.

// prisma/postgres/schema.prisma
// Generate with: npx prisma generate --schema=prisma/postgres/schema.prisma
generator client {
  provider = "prisma-client-js"
  output   = "../generated/postgres-client"
}

datasource db {
  provider = "postgresql"
  url      = env("POSTGRES_URL")
}

model Order {
  id        String   @id @default(cuid())
  userId    String
  total     Float
  createdAt DateTime @default(now())
}

// prisma/mongo/schema.prisma
// Generate with: npx prisma generate --schema=prisma/mongo/schema.prisma
generator client {
  provider = "prisma-client-js"
  output   = "../generated/mongo-client"
}

datasource db {
  provider = "mongodb"
  url      = env("MONGODB_URL")
}

model EventLog {
  id        String   @id @default(auto()) @map("_id") @db.ObjectId
  eventId   String   @unique
  rawData   Json     // MongoDB is great for flexible JSON
  topic     String
  timestamp DateTime @default(now())
}

With this setup, your processing function can decide where to put data. The final order goes to PostgreSQL through one client; the full audit trail of the event goes to MongoDB through the other.

// Import each generated client (adjust paths to match your generator output)
import { PrismaClient as PostgresClient } from '../prisma/generated/postgres-client/index.js';
import { PrismaClient as MongoClient } from '../prisma/generated/mongo-client/index.js';

const postgres = new PostgresClient();
const mongo = new MongoClient();

async function persistEvent(event) {
  // Write the final order to PostgreSQL
  await postgres.order.create({
    data: {
      userId: event.userId,
      total: event.cartTotal,
    },
  });

  // Also write the raw event log to MongoDB
  await mongo.eventLog.create({
    data: {
      eventId: event.id,
      rawData: event,
      topic: 'raw-user-events',
    },
  });
}
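A wrinkle worth naming: PostgreSQL and MongoDB cannot share a single transaction, so a crash between the two writes leaves the pipeline half-done. The standard answer is to make the persist step safe to re-run and simply retry it. The @unique constraint on eventId helps here: Prisma reports a unique-constraint violation as error code P2002, so a small wrapper (makeIdempotent is a name of my choosing) can treat "already written" as success. For the Order side you would want a similar unique key, such as storing the eventId on Order, so both writes are individually safe to repeat.

```javascript
// Prisma surfaces a unique-constraint violation as error code P2002.
// makeIdempotent wraps any persist function so that a duplicate write
// (a retry that already half-succeeded) counts as success, not failure.
function makeIdempotent(persistFn) {
  return async (event) => {
    try {
      await persistFn(event);
      return 'persisted';
    } catch (error) {
      if (error && error.code === 'P2002') return 'duplicate';
      throw error; // anything else is a real failure
    }
  };
}

// Usage: const persistOnce = makeIdempotent(persistEvent);
```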

Everything seems perfect now, right? Not quite. What happens when something goes wrong? A piece of data might be malformed—a missing field, an invalid date. If your consumer throws an error and stops, your whole pipeline halts. We can’t have that.

The professional fix is a Dead Letter Queue (DLQ). It’s a safety net topic. When a message fails processing after a few retries, you don’t throw it away. You publish it to the DLQ with a note about why it failed. This keeps your main flow moving and gives you a parked area to inspect broken messages later.

// A dedicated producer for the safety-net topic
const dlqProducer = kafka.producer();
await dlqProducer.connect();

async function processWithDLQ(topic, message) {
  try {
    await complexTransformation(message);
  } catch (error) {
    console.error('Processing failed:', error);
    // Park the message in the Dead Letter Queue for later analysis
    await dlqProducer.send({
      topic: 'dead-letter-queue',
      messages: [{
        value: message.value,
        headers: {
          'error': error.message,
          'originalTopic': topic,
          'timestamp': new Date().toISOString()
        }
      }]
    });
  }
}
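The description above mentions "a few retries," so here is one way to add that step: a generic backoff helper (withRetries is a name I am introducing) you would call inside the try block before falling through to the DLQ.

```javascript
// Retry an async operation with exponential backoff before giving up.
// Only after the final attempt fails should the message go to the DLQ.
async function withRetries(fn, attempts = 3, baseDelayMs = 200) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt === attempts) throw error; // out of retries: let the caller DLQ it
      // 200ms, 400ms, 800ms, ... gives transient faults time to clear
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Inside processWithDLQ, the call becomes `await withRetries(() => complexTransformation(message))`, so only messages that fail every attempt are parked.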

Suddenly, you have a system. Data flows in, gets checked for duplicates, is processed, split, stored in the right places, and errors are safely quarantined. It’s resilient. You can sleep at night.

This is more than just code. It’s a way of thinking about data as a continuous stream that needs guidance. By combining Kafka’s durability, Redis’s speed, and Prisma’s type-safe multi-database access, you build a pipeline that is both strong and adaptable. It handles scale and shrugs off failure.

Was there a step that made you pause? A concept you’d like to see explained with a different example? Building this is a journey, and I’d love to hear about yours. If this guide helped you connect the dots, please share it with someone else who might be facing the same puzzle. Your thoughts and questions in the comments are what help these ideas grow. Let’s keep the conversation going.


As a best-selling author, I invite you to explore my books on Amazon. Don’t forget to follow me on Medium and show your support. Thank you! Your support means the world!


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!


📘 Check out my latest ebook for free on my channel!
Be sure to like, share, comment, and subscribe to the channel!


Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva



