So far, we have largely skipped the internal workings of a Kafka cluster and its interactions with consumers and producers. In this chapter, we'll take a closer look at how the different Kafka components work.
We'll start with the Kafka cluster itself, which consists of several brokers working together. Each broker has a unique ID that is either set in its configuration file or generated automatically, and brokers maintain their membership in the cluster through this ID. When a broker starts, it creates an ephemeral node named after its ID under the Zookeeper path /brokers/ids. Other Kafka components keep a watch on /brokers/ids, so they are notified whenever a broker joins or leaves the cluster. A new broker can't register itself with the same ID as a broker that is already registered.
A broker can lose its connection to Zookeeper for a variety of reasons, such as:
the broker being stopped deliberately
a long garbage collector pause
a network partition
If such a situation occurs, the ephemeral node that the broker created when it started is automatically removed from Zookeeper once the broker's session ends. Kafka components watching the list of brokers are notified that the broker has left. Interestingly, if a brand new broker is then started with the same ID as the broker that left, the new broker is assigned the same topics and partitions as the broker that left. This is because the ID of the departed broker is still retained in other internal data structures, for example in the list of replicas of each partition. When a new broker comes along with the same ID, those data structures simply start referring to the new broker.
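The membership behavior described above can be illustrated with a small in-memory model. This is only a sketch in Python under stated assumptions: it does not talk to Zookeeper or Kafka, and the BrokerRegistry class, its methods (watch, register, session_lost, assign), the host names and the partition name are all hypothetical, invented for illustration. It shows three things from the text: a duplicate ID is rejected while the original broker is registered, watchers are notified when a broker joins or leaves, and partition assignments keyed by broker ID outlive the broker's registration.

```python
from typing import Callable, Dict, List, Set


class BrokerRegistry:
    """Toy stand-in for the broker ephemeral nodes (not a real Zookeeper client)."""

    def __init__(self) -> None:
        # broker ID -> host; plays the role of the ephemeral nodes
        self.live: Dict[int, str] = {}
        # broker ID -> partitions; retained even after the broker leaves
        self.assignments: Dict[int, Set[str]] = {}
        # callbacks that play the role of watches on the broker list
        self.watchers: List[Callable[[str, int], None]] = []

    def watch(self, callback: Callable[[str, int], None]) -> None:
        self.watchers.append(callback)

    def _notify(self, event: str, broker_id: int) -> None:
        for callback in self.watchers:
            callback(event, broker_id)

    def register(self, broker_id: int, host: str) -> None:
        # An ID that is currently registered cannot be registered again.
        if broker_id in self.live:
            raise ValueError(f"broker ID {broker_id} is already registered")
        self.live[broker_id] = host
        self.assignments.setdefault(broker_id, set())
        self._notify("joined", broker_id)

    def session_lost(self, broker_id: int) -> None:
        # The ephemeral node disappears, but the assignments are kept.
        if self.live.pop(broker_id, None) is not None:
            self._notify("left", broker_id)

    def assign(self, broker_id: int, partition: str) -> None:
        self.assignments.setdefault(broker_id, set()).add(partition)


events = []
registry = BrokerRegistry()
registry.watch(lambda event, broker_id: events.append((event, broker_id)))

registry.register(1, "host-a")     # broker 1 joins
registry.assign(1, "orders-0")     # broker 1 is given a partition
registry.session_lost(1)           # broker 1 loses its session and leaves
registry.register(1, "host-b")     # a brand new broker reuses ID 1

print(events)
print(registry.live[1], registry.assignments[1])
```

In this model the new broker on host-b ends up with the partition orders-0 only because the assignment is keyed by the broker ID rather than by the host, which mirrors the explanation in the text.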