Kafka with KRaft (Kafka Raft)


1. What is Kafka?

Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, real-time data streaming. It is used for publish-subscribe messaging, event sourcing, log processing, and real-time analytics.

Key Features:

  • Scalability: Distributes data across multiple brokers.
  • Durability: Stores data persistently.
  • High Throughput: Handles millions of messages per second.
  • Fault Tolerance: Replicates data across nodes.

Core Components:

  • Producers: Send messages (events) to Kafka.
  • Topics: Logical channels where messages are stored.
  • Partitions: Sub-divisions of topics for parallel processing.
  • Consumers: Read messages from topics.
  • Brokers: Kafka servers that store and manage data.
  • Zookeeper (or KRaft in newer versions): Manages metadata and leader election.

2. How Kafka Works (Step-by-Step)

Step 1: Producers Publish Messages

  • Producers send messages to a topic.
  • Kafka distributes messages among partitions (using round-robin or key-based distribution).
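The partitioning rule above can be sketched in a few lines. This is an illustrative Python sketch, not the Kafka client API: the real default partitioner uses murmur2 hashing (and "sticky" batching for unkeyed messages), while here `zlib.crc32` and a simple cycle stand in for them.

```python
import itertools
import zlib

NUM_PARTITIONS = 3
_round_robin = itertools.cycle(range(NUM_PARTITIONS))

def choose_partition(key):
    """Pick a partition the way a Kafka producer does, simplified."""
    if key is None:
        # No key: spread messages evenly across partitions (round-robin).
        return next(_round_robin)
    # Keyed message: hashing the key means the same key always lands on
    # the same partition, which preserves per-key ordering.
    return zlib.crc32(key) % NUM_PARTITIONS

# The same key always maps to the same partition:
assert choose_partition(b"order-42") == choose_partition(b"order-42")
```

The key takeaway: ordering is guaranteed only within a partition, so events that must stay ordered (e.g. all updates for one order ID) should share a key.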

Step 2: Brokers Store Messages

  • Brokers receive data and store it in partition logs.
  • Messages persist based on retention policies.
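Retention is controlled by broker (or per-topic) settings. A typical fragment of `server.properties` looks like this; the values shown are Kafka's defaults:

```properties
# Time-based retention: delete log segments older than 7 days
log.retention.hours=168
# Size-based retention per partition (-1 = no size limit)
log.retention.bytes=-1
# Roll to a new segment file after 1 GiB
log.segment.bytes=1073741824
```

Messages are deleted when retention limits are reached, regardless of whether any consumer has read them.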

Step 3: Consumers Subscribe to Topics

  • Consumers read messages sequentially from partitions.
  • Kafka tracks each consumer's offset (its current position within a partition).
  • Consumers commit offsets so that, after a restart or failure, they resume from the last committed position.
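The offset mechanics above can be sketched as follows. This is a minimal Python illustration, not the real consumer client: a plain list stands in for a partition log, and a dict stands in for Kafka's committed-offsets store.

```python
# A partition is an append-only log; offsets are just positions in it.
partition_log = ["msg-0", "msg-1", "msg-2", "msg-3", "msg-4"]

# Last committed offset per partition (illustrative key name).
committed = {"test-topic-0": 0}

def consume(log, commits, partition_key, batch_size):
    """Read a batch starting at the committed offset, then commit."""
    start = commits[partition_key]
    batch = log[start:start + batch_size]
    # After processing, commit the new position so a restarted
    # consumer does not re-read these messages.
    commits[partition_key] = start + len(batch)
    return batch

first = consume(partition_log, committed, "test-topic-0", 2)   # ["msg-0", "msg-1"]
second = consume(partition_log, committed, "test-topic-0", 2)  # ["msg-2", "msg-3"]
```

If the consumer crashed before committing, it would re-read the last batch on restart; this is why Kafka's default delivery guarantee is at-least-once.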

Step 4: Message Processing

  • Consumers process messages and may send them to other Kafka topics, databases, or services.

Example: Kafka in Action

Imagine an e-commerce platform:

  • Order Service (Producer) sends order events to orders-topic.
  • Inventory Service (Consumer) updates stock.
  • Shipping Service (Consumer) prepares delivery.
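What makes this fan-out work is that each service belongs to its own consumer group, and every group keeps its own offsets, so each service independently receives every order event. A small Python sketch of that idea (group names and event fields are illustrative, not from a real API):

```python
# The orders topic, modeled as an append-only list of events.
orders_topic = [
    {"order_id": 1, "item": "laptop"},
    {"order_id": 2, "item": "phone"},
]

# Each consumer group tracks its own position independently.
group_offsets = {"inventory-group": 0, "shipping-group": 0}

def poll(group):
    """Return all events this group has not yet seen."""
    pos = group_offsets[group]
    events = orders_topic[pos:]
    group_offsets[group] = len(orders_topic)
    return events

inventory_events = poll("inventory-group")  # both orders
shipping_events = poll("shipping-group")    # both orders, independently
```

Because offsets are per group, adding a new downstream service (say, analytics) never affects the existing consumers.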

3. What is Zookeeper in Kafka?

Apache Zookeeper is a distributed coordination service used by Kafka for:

  1. Broker Metadata Management: Keeps track of brokers in the cluster.
  2. Leader Election: Ensures high availability by selecting leader brokers.
  3. Topic and Partition Management: Stores information about topics and their partitions.

Limitations of Zookeeper:

  • External Dependency: If the Zookeeper ensemble becomes unavailable, Kafka cannot elect partition leaders or update cluster metadata.
  • Operational Overhead: Requires a separate cluster to deploy, secure, monitor, and maintain.
  • Scalability Issues: Metadata updates propagate slowly as the number of brokers and partitions grows.


4. KRaft (Kafka Raft) – Replacing Zookeeper

Kafka Raft (KRaft) is a Zookeeper-free mode introduced as early access in Kafka 2.8 and marked production-ready in Kafka 3.3; Zookeeper support was removed entirely in Kafka 4.0.

Why KRaft?

  • Removes dependency on Zookeeper.
  • Improves scalability: No separate cluster needed.
  • Simplifies deployment: Kafka manages its metadata internally.

How KRaft Works:

  • Uses the Raft consensus algorithm for leader election and metadata replication.
  • A quorum of controller nodes (dedicated controllers, or brokers that also carry the controller role) replicates the cluster metadata log; one of them is elected the active controller.
  • Brokers fetch metadata changes from this replicated log instead of from Zookeeper.
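The core rule Raft relies on can be shown in a few lines. This is a deliberately tiny Python sketch of Raft's majority-quorum commit rule as it applies to the KRaft controller quorum, not an implementation of Raft itself (node names are illustrative):

```python
# A three-node controller quorum, as configured via
# controller.quorum.voters in server.properties.
VOTERS = ["controller-1", "controller-2", "controller-3"]

def is_committed(acknowledged):
    """A metadata record is committed once a strict majority
    of voters has replicated it."""
    return len(acknowledged) > len(VOTERS) // 2

# Two of three voters is a majority; one is not.
assert is_committed({"controller-1", "controller-2"}) is True
assert is_committed({"controller-1"}) is False
```

The same majority rule governs leader election, which is why a 3-node quorum tolerates one node failure and a 5-node quorum tolerates two.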

5. How to Set Up Kafka with KRaft (Without Zookeeper)

Step 1: Install Kafka

Download Kafka:

wget https://downloads.apache.org/kafka/3.5.0/kafka_2.13-3.5.0.tgz
tar -xzf kafka_2.13-3.5.0.tgz
cd kafka_2.13-3.5.0

Step 2: Generate KRaft Metadata

Run:

bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c config/kraft/server.properties

This generates a cluster ID and formats the metadata storage directory. Kafka expects the cluster ID in the base64-encoded form that kafka-storage.sh random-uuid produces.

Step 3: Configure Kafka for KRaft

Edit config/kraft/server.properties:

# This node acts as both a controller and a broker
process.roles=controller,broker
# Unique ID for this node within the cluster
node.id=1
# Controller quorum: <node.id>@<host>:<controller-port> for each voter
controller.quorum.voters=1@localhost:9093
# Client traffic on 9092, controller traffic on 9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
# Directory for partition data and the metadata log
log.dirs=/tmp/kafka-logs

Step 4: Start Kafka (Without Zookeeper)

bin/kafka-server-start.sh config/kraft/server.properties

Step 5: Create a Topic

bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

Step 6: Start a Producer

bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092

Type messages and press Enter; each line is published as an event to test-topic.

Step 7: Start a Consumer

bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092

Conclusion

  • Kafka is a high-throughput event streaming system.
  • Zookeeper was used for metadata management but had scalability issues.
  • KRaft (Kafka Raft) replaces Zookeeper with an internal Raft-based system.
  • Setting up Kafka with KRaft is easier and more scalable.

You can find more articles and tutorials about Kafka here.
