Membership of consumers in a consumer group is coordinated by a designated Kafka broker referred to as the group coordinator. The coordinator receives heartbeats from consumers to confirm that they are alive and healthy. When it doesn’t hear from a consumer for a configurable period of time (the session timeout, set on the consumer with session.timeout.ms), it assumes that consumer has crashed and triggers a partition rebalance. Since version 0.10.1, the Kafka consumer sends its heartbeats to the group coordinator periodically from a dedicated background thread.
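The liveness check can be illustrated with a small simulation. This is only a sketch of the idea, not Kafka's broker code: the GroupCoordinator class, its method names, and the 10-second timeout are all hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class GroupCoordinator:
    """Toy stand-in for the broker-side coordinator (hypothetical, not Kafka's code)."""
    session_timeout_s: float
    last_heartbeat: dict = field(default_factory=dict)  # member id -> last heartbeat time

    def heartbeat(self, member_id: str, now: float) -> None:
        # Record the time at which this member was last heard from.
        self.last_heartbeat[member_id] = now

    def expire_members(self, now: float) -> list:
        """Remove and return members that have been silent longer than the timeout."""
        dead = sorted(
            member for member, seen in self.last_heartbeat.items()
            if now - seen > self.session_timeout_s
        )
        for member in dead:
            del self.last_heartbeat[member]
        return dead


coordinator = GroupCoordinator(session_timeout_s=10.0)
coordinator.heartbeat("consumer-1", now=0.0)
coordinator.heartbeat("consumer-2", now=0.0)
coordinator.heartbeat("consumer-1", now=8.0)  # consumer-2 stays silent

expired = coordinator.expire_members(now=12.0)
print(expired)  # ['consumer-2']
```

Any member returned by expire_members is what the coordinator would treat as crashed, which is the point at which a rebalance would be triggered.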
The process of changing ownership of a partition from one consumer to another within the same consumer group is known as a partition rebalance. When a consumer dies, a rebalance assigns the partitions it was reading from to the remaining healthy consumers in the group. A rebalance can also happen when partitions are added to a topic. Rebalancing provides high availability and scalability, but it is otherwise undesirable: while a rebalance is in progress, the partitions are unavailable for consumers to read. When a consumer leaves a consumer group gracefully, it informs the coordinator of its intention through the close call, so the group coordinator doesn’t have to wait to detect that the consumer has left and can initiate a rebalance immediately. Another side effect of a rebalance is that consumers may have to invalidate their caches or any other maintained state, as they may now be reading from different partitions.
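To see why state may need to be invalidated, it helps to compare the assignment before and after a rebalance. The sketch below is purely illustrative: the ownership_changes helper and the consumer names are hypothetical, and the assignments are made-up data rather than output from a real cluster.

```python
def ownership_changes(before: dict, after: dict) -> dict:
    """For each surviving consumer, list the partitions it lost and gained."""
    changes = {}
    for consumer, new_partitions in after.items():
        old = set(before.get(consumer, []))
        new = set(new_partitions)
        changes[consumer] = {
            "revoked": sorted(old - new),  # state kept for these is now stale
            "gained": sorted(new - old),   # state for these must be built up
        }
    return changes


# Hypothetical assignments: consumer-3 leaves the group and its
# partitions are spread over the two remaining consumers.
before = {"consumer-1": [0, 1], "consumer-2": [2, 3], "consumer-3": [4, 5]}
after = {"consumer-1": [0, 1, 2], "consumer-2": [3, 4, 5]}

changes = ownership_changes(before, after)
for consumer, change in changes.items():
    print(consumer, change)
```

In this made-up example consumer-2 both loses partition 2 and gains partitions 4 and 5, so any cache it kept for partition 2 would have to be dropped even though consumer-2 itself never failed.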
The assignment of partitions to consumers during a rebalance is done by the consumer designated as the group leader. Consumers join a consumer group by sending a JoinGroup request to the coordinator, and the first one to do so becomes the group leader. The group leader receives the list of healthy consumers in the consumer group from the group coordinator and makes the partition assignments. It sends these assignments to the group coordinator, which in turn distributes them to the consumers. Each consumer can see only its own assignment, not those of the other consumers in the consumer group; the group leader is the only client process that has the list of assignments for all the consumers within the group. The partition assignment logic is determined by an implementation class of the PartitionAssignor interface.
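The division of labour between the leader and the coordinator can be sketched as follows. This is a simplified model under stated assumptions: the round-robin strategy is just one possible assignment policy chosen for illustration, and the function and variable names (round_robin_assign, join_order, sync_responses) are hypothetical rather than part of Kafka's API.

```python
def round_robin_assign(members, partitions) -> dict:
    """Leader-side logic: deal partitions out to members one at a time."""
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


# Members in the order their join requests reached the coordinator.
join_order = ["consumer-b", "consumer-a", "consumer-c"]
leader = join_order[0]  # the first member to join becomes the leader

# Only the leader computes, and therefore sees, the full assignment.
full_assignment = round_robin_assign(join_order, range(5))

# Coordinator side: each member is handed only its own slice.
sync_responses = {member: full_assignment[member] for member in join_order}

print(leader)                        # consumer-b
print(sync_responses["consumer-c"])  # [2]
```

Keeping the assignment logic on the client side means the policy can be changed by configuring the consumers, without touching the broker.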
Below is a pictorial representation of partition reassignment when one of the consumers in a consumer group goes down. The group consists of two consumers reading messages from three partitions of a topic.
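The same scenario can be traced in a few lines of code. The sketch assumes a simple contiguous split of partitions across consumers; the assign_contiguous helper and the consumer names are hypothetical, and the actual split in a real group depends on the configured assignor.

```python
def assign_contiguous(consumers, num_partitions: int) -> dict:
    """Split partitions 0..num_partitions-1 into contiguous chunks, one per consumer."""
    ordered = sorted(consumers)
    base, extra = divmod(num_partitions, len(ordered))
    assignment, start = {}, 0
    for i, consumer in enumerate(ordered):
        count = base + (1 if i < extra else 0)  # earlier consumers absorb the remainder
        assignment[consumer] = list(range(start, start + count))
        start += count
    return assignment


# Two healthy consumers share three partitions.
before = assign_contiguous(["consumer-1", "consumer-2"], 3)
print(before)  # {'consumer-1': [0, 1], 'consumer-2': [2]}

# consumer-2 goes down; the rebalance hands its partition to the survivor.
after = assign_contiguous(["consumer-1"], 3)
print(after)   # {'consumer-1': [0, 1, 2]}
```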