There are two general strategies for consumer group membership in Apache Kafka®: static and dynamic. Static membership came about as a practical solution to a problem encountered when developers use dynamic membership in rolling upgrades: if you have a consumer group and you want to do a rolling upgrade, you have to bring each consumer down (triggering a rebalance) and then restart it (triggering another rebalance). That means that the number of rebalances you have to live with when you use a rolling upgrade is double the number of consumers. Consequently, a rolling upgrade can take a long time.
Luckily, static membership was implemented and with the right configuration this is no longer a problem. But how does it actually work? To start, let’s review what happens when a new consumer joins a group, triggering a rebalance.
When a new consumer joins a group, the first step is the broker’s recognition of the new consumer. In the second step, the broker adjusts the consumer group state from RUNNING to PREPARE_REBALANCE. Once the consumer group reacts to the change in state, the final state that the broker introduces is COMPLETING_REBALANCE. (Afterwards, the state is restored to RUNNING once more.)
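The state changes described above can be sketched as a small lookup table. This is only an illustration using the state names from this post; the function names are hypothetical, and a real broker tracks more states and conditions than this sketch does.

```javascript
// Simplified model of the group states named in this post.
// Illustrative only: this is not how the broker is implemented.
const TRANSITIONS = {
  RUNNING: 'PREPARE_REBALANCE',              // a member joins or leaves
  PREPARE_REBALANCE: 'COMPLETING_REBALANCE', // the group has reacted to the change
  COMPLETING_REBALANCE: 'RUNNING',           // the rebalance is finished
};

// Return the state that follows the given one in this simplified cycle.
function nextState(state) {
  const next = TRANSITIONS[state];
  if (next === undefined) {
    throw new Error(`Unknown group state: ${state}`);
  }
  return next;
}

// Walk one full rebalance cycle, starting from a running group.
function rebalanceCycle() {
  const visited = ['RUNNING'];
  let state = 'RUNNING';
  do {
    state = nextState(state);
    visited.push(state);
  } while (state !== 'RUNNING');
  return visited;
}

console.log(rebalanceCycle().join(' -> '));
// RUNNING -> PREPARE_REBALANCE -> COMPLETING_REBALANCE -> RUNNING
```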
Now, what pieces of information does the broker use to track all this? Well, in each type of consumer membership, there’s a member.id that ensures a unique identity for each group member.
With dynamic membership, at the point in a rebalance when a new consumer sends a JoinGroupRequest with a special UNKNOWN_MEMBER_ID, the broker receives that UNKNOWN_MEMBER_ID, sees that consumer as a new member of the group, and generates a new ID for it.
During a client restart, this happens for all members of the group because when the members send their JoinGroupRequests, an UNKNOWN_MEMBER_ID is included. The member.id is not persisted, so a rebalance is triggered because the broker reads these members as new members.
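To make the dynamic case concrete, here is a toy model of a broker handing out member IDs. The class name, the ID format, and the rebalance counter are all assumptions made for illustration; none of it is broker code. The unknown ID is modeled as an empty string.

```javascript
// Toy model only: names and ID format are hypothetical, not Kafka internals.
const UNKNOWN_MEMBER_ID = ''; // placeholder a consumer sends when it has no member.id

class ToyGroupCoordinator {
  constructor() {
    this.members = new Set(); // member.ids currently in the group
    this.rebalances = 0;      // how many rebalances this toy group has seen
    this.nextId = 1;
  }

  // Handle a heavily simplified JoinGroupRequest.
  joinGroup(memberId) {
    if (memberId !== UNKNOWN_MEMBER_ID) {
      if (!this.members.has(memberId)) {
        throw new Error(`Unrecognized member.id: ${memberId}`);
      }
      return memberId; // known member, nothing changes
    }
    // Unknown ID: treat the consumer as new, generate an ID, and rebalance.
    const newId = `member-${this.nextId++}`;
    this.members.add(newId);
    this.rebalances += 1;
    return newId;
  }

  // A consumer going down is modeled as an explicit departure.
  leaveGroup(memberId) {
    if (this.members.delete(memberId)) {
      this.rebalances += 1;
    }
  }
}

// One consumer joins, restarts (losing its in-memory member.id), and rejoins.
const coordinator = new ToyGroupCoordinator();
const firstId = coordinator.joinGroup(UNKNOWN_MEMBER_ID);
coordinator.leaveGroup(firstId);
const secondId = coordinator.joinGroup(UNKNOWN_MEMBER_ID);
console.log(firstId, secondId, coordinator.rebalances); // member-1 member-2 3
```

In this sketch the restart alone costs two rebalances (one on the way down, one on the way back up), which is the doubling described at the start of this post.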
KIP-345 (which is a Kafka Improvement Proposal, introducing the concept of static membership to the Kafka community) notes:
“When the service state is heavy, a rebalance of one topic partition from instance A to B means a huge amount of data transfer.” — KIP-345
In this context, a “heavy” service state means that an app has built up a large amount of local state, perhaps a large KTable. If the partition is reassigned, then that state must be saved and transferred to a new node, and if there’s a lot of data, the process will be slow.
Now, this process can and should be avoided altogether if the partition assignment remains unchanged, or has the potential to remain unchanged, before and after the rebalance. But how to execute this task? This is where static membership comes into play.
What is static membership, and how does it solve the problem encountered by heavy-state applications with dynamic membership?
A consumer with static membership is a consumer with a configured group.instance.id, in addition to its broker-managed member.id. The group.instance.id persists the identity of the consumer over a rebalance.
So, what would happen if a group of consumers with static membership needed to restart? In this case, since the broker recognizes the group.instance.id, it would not need to assign new membership, and a rebalance can be avoided, provided each consumer rejoins before its session.timeout.ms expires. This makes heavy-state applications much more efficient, and it also makes rolling upgrades much faster. Since the consumers’ identity as part of the original group is persisted by the group.instance.id, they no longer have to rebalance on removal and re-entry into the group. The only lag involved will be from the restarted node.
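The static case can be sketched with another toy model. Everything here is an assumption made for illustration: the class and method names are hypothetical, partitions are passed in by the caller rather than computed by an assignor, and session timeouts are ignored entirely. The one idea it aims to show is that a recognized group.instance.id keeps its assignment without a rebalance.

```javascript
// Toy model only: not broker code. Session timeouts and assignors are left out.
class ToyStaticCoordinator {
  constructor() {
    this.members = new Map(); // group.instance.id -> { memberId, partitions }
    this.rebalances = 0;
    this.nextId = 1;
  }

  // partitionsIfNew is a stand-in for whatever an assignor would hand out.
  joinGroup(groupInstanceId, partitionsIfNew) {
    const memberId = `${groupInstanceId}-${this.nextId++}`;
    const existing = this.members.get(groupInstanceId);
    if (existing !== undefined) {
      // Recognized instance: record the fresh member.id, keep the assignment.
      existing.memberId = memberId;
      return { memberId, partitions: existing.partitions, rebalanced: false };
    }
    // Unrecognized instance: a genuinely new member, so the group rebalances.
    this.members.set(groupInstanceId, { memberId, partitions: partitionsIfNew });
    this.rebalances += 1;
    return { memberId, partitions: partitionsIfNew, rebalanced: true };
  }
}

// Two consumers join, then the first one restarts and rejoins.
const staticCoordinator = new ToyStaticCoordinator();
staticCoordinator.joinGroup('consumer-a', [0, 1]);
staticCoordinator.joinGroup('consumer-b', [2, 3]);
const afterRestart = staticCoordinator.joinGroup('consumer-a', []);
console.log(afterRestart.rebalanced, afterRestart.partitions, staticCoordinator.rebalances);
```

In this sketch the restarted consumer gets partitions 0 and 1 back, and the rebalance count stays at the two rebalances caused by the initial joins.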
To implement static membership in your consumer group, set the group.instance.id value in your configuration. Here’s an example of what that might look like, using the node-rdkafka client:
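The following is a minimal sketch rather than a definitive setup. It assumes the node-rdkafka package is installed; the broker address, the group name, the helper function names, and the KAFKA_INSTANCE_ID environment variable are placeholders chosen for illustration.

```javascript
// Build the consumer configuration. The broker address and group name
// below are placeholder values, not recommendations.
function buildConsumerConfig(groupInstanceId) {
  if (!groupInstanceId) {
    throw new Error('A non-empty group.instance.id is required for static membership');
  }
  return {
    'metadata.broker.list': 'localhost:9092', // placeholder broker address
    'group.id': 'example-group',              // shared by every member of the group
    'group.instance.id': groupInstanceId,     // unique to this consumer instance
  };
}

// Create a consumer from the node-rdkafka module passed in by the caller.
function createConsumer(Kafka, groupInstanceId) {
  return new Kafka.KafkaConsumer(buildConsumerConfig(groupInstanceId), {});
}

// Usage (requires node-rdkafka and a reachable broker):
//   const Kafka = require('node-rdkafka');
//   const consumer = createConsumer(Kafka, process.env.KAFKA_INSTANCE_ID);
//   consumer.connect();
```

Passing the module in as an argument is just a convenience for this sketch, since it lets the configuration be checked without a broker. The instance ID needs to stay the same across restarts of a given consumer, which is why it is read from the environment here rather than generated at startup.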
group.instance.id vs. group.id
You might have noticed in the example above that the consumer is instantiated with both a group.id and a group.instance.id. What’s the difference? A group.id is what determines which group a consumer belongs to. You can read more about group IDs in our blog post Configuring Apache Kafka Group IDs.
By contrast, a group.instance.id should be different for each member of the consumer group, because it encapsulates the member’s identity as a consumer instance. The broker maps each group.instance.id to a member.id to ensure each consumer’s unique identity. The member.id serves as extra validation in the case of static membership.
This article explains the philosophy behind static membership and shows you how to implement it using configuration to speed up your rebalances. But there’s a latent question here: How does Kafka “know” that it has to rebalance a consumer group? You might consider direct addition or removal of a consumer, but what about when a consumer dies? Our next post in the series will dive into how Kafka brokers detect a dead consumer and how you can configure the broker for high performance in that scenario.
In addition, we recommend the following resources:
Consult the documentation, Consumer configuration
Read the original KIP which includes an extensive defense for the change, KIP-345
View this Kafka Summit talk by Boyang Chen and Liquan Pei, Static Membership: Rebalance Strategy Designed for the Cloud
If you’re interested in more blog posts in this series, check out Apache Kafka Beyond the Basics: Windowing, and Co-Partitioning with Apache Kafka