Apache Kafka®️ 비용 절감 방법 및 최적의 비용 설계 안내 웨비나 | 자세히 알아보려면 지금 등록하세요

Sep 29, 2023읽기 시간: 4 min

An Introduction to Apache Kafka Consumer Group Strategy

작성자:

Lucia CerchieSenior Software Engineer

Sep 29, 2023읽기 시간: 4 min

Issues with broker load

Ever dealt with a misbehaving consumer group? Imbalanced broker load? This could be due to your consumer group and partitioning strategy!

Once, on a dark and stormy night, I set myself up for this error. I was creating an application to demonstrate how you can use Apache Kafka® to decouple microservices. The function of my “microservices” was to create latte objects for a restaurant ordering service. It was set up a little like this:

I wanted to implement this in Kafka by using consumers, each reading from a common coffee topic, but with their own partition. Now this was a naive approach. Why?

Well, let’s say it’s pumpkin spice latte season. In the United States, that’s a period of time lasting from August to December during which the pumpkin spice coffee flavor undergoes overwhelming demand at cafes. This would mean the pumpkin spice-flavored coffee order events would dramatically increase.

Since I had each consumer in the same group for one coffee topic, every coffee ‘microservice’ consumed only from a single partition, ignoring Kafka’s mechanism for parallelism. Eventually, I can expect a hot spot in my pumpkin spice partition, which means that some of my brokers would be slow to distribute their pumpkin spice—and others would not be used to their full potential.

How do we solve this issue? Instead of having every consumer in one group, and organizing the different latte flavors by message key, I could send each latte flavor to its own multithreaded topic, and the load for latte orders would then be balanced across partitions, like so:

Now, technically, I could keep all the consumers in the same group, consuming from different topics. But in this case, it’d be best practice to give each consumer its own group. That way, I could scale these topics separately by adding more consumers to a specific group.

Let’s step back a little bit and consider consumer group strategy in general.

What is the function of a consumer group?

One of the more foundational concepts of consumer groups is that each consumer, once assigned to a group, shares the workload. You can read more about this in our blog post about configuring consumer Group IDs, but basically, when each consumer has the same Group ID, they cannot read from the same partition:

Now, you don’t always need to reach for a single consumer group, like I had done at first in my latte project… think through your use case! Sometimes, you may have two consumers (each in different consumer groups, allowing for parallel processing) for two different services, like a customer address service and a customer delivery notification service, that would need to read from the same partition in the same topic. These two consumers in different groups reading from the same topic can pick up reading from different offsets, which they would not be able to do if Kafka used a queue rather than a persistent log.

However, if you do want your consumers to read from the same topic, you need to consider a couple of things: 1) the number of consumers in your group and 2) the number of partitions. This consideration is important because the consumers will share the workload as equally as possible among themselves. Sometimes making this decision is easy—I’m writing a demo app in Kafka Streams with 1 consumer and 1 partition right now because I just need to show developers how to use a .process() function in Kafka Streams. But in most use cases, this decision takes careful consideration.

Considerations in favor of fewer consumers in relation to a high number of partitions include things like how high you want your consumer throughput to be since each partition is given one thread per consumer. You might also want room to increase parallelism later on.

Considerations against having fewer consumers to a higher number of partitions include things like higher unavailability in the case of unclean failure since leader election will take longer.

If you’re interested in reading more about consumer strategy in relation to partitions, Jun Rao has an excellent blog post on the subject.

Exactly how do I configure my consumer group?

Let’s take a look at what consumer group configuration looks like in JS, Java, and Python.

Here’s the example of consumer group configuration in KafkaJS that we saw above:

export function createConsumerConfigMap(config, groupid) {
  if (config.hasOwnProperty('security. protocol')) {
    return {
      'bootstrap.servers': config['bootstrap.servers'],
      'sasl.username': config['sasl.username'],
      'sasl.password': config['sasl.password'],
      'security.protocol': config['security.protocol'],
      'sasl.mechanisms': config['sasl.mechanisms'],
      'group.id': 'consumer-group-id',
    }
  } else {
    return {
      'bootstrap.servers': config['bootstrap.servers'],
      'group.id': 'consumer-group-id',
    }
  }
}

In this example, the group.id is set when the function that creates the consumer is called.

In the Python Kafka client, you might do something like this in a config.properties file:

[consumer]
auto.offset.reset=earliest
group.id=consumer-group-id
enable.auto.commit=true
max.poll.interval.ms=3000000

Then you could import those properties whenever you’re creating a consumer.

Now, if you’re using Java to create a Kafka Streams app, it gets a little trickier. That’s because the Group ID configuration on the consumer is not called group.id or something similar that you might expect. Instead, the APPLICATION_ID_CONFIG is what Kafka Streams ends up using as the consumer Group ID.

props.put(StreamsConfig.APPLICATION_ID_CONFIG, "consumer-group-id");

Where to go from here

I’m hoping this blog post gave you something to think about when it comes to consumer group strategy, whether you’re new to configuring consumer groups or designing a large piece of architecture. If you’d like to learn more, here are some resources I recommend:

Kafka Consumer Groups Tool: Documentation on using Kafka’s consumer groups tool to check metadata on groups
Configuring Kafka Consumer Group IDs: Learn more about consumer Group IDs, specifically
Here’s a video introduction to Kafka Consumers
The Kafka Consumer Group Protocol is about to get simpler—read KIP-848 for the details

Lucia Cerchie is a Senior Software Engineer at Confluent. She believes in a human-centered developer experience and in the joy of learning. She blogs at her personal site, https://luciacerchie.dev/blog/, and here on Confluent, https://www.confluent.io/blog/author/lucia-cerchie/

이 블로그 게시물이 마음에 드셨나요? 지금 공유해 주세요.

What Are Apache Kafka® Consumer Group IDs?

Dec 6, 2022

Learn what a Kafka® consumer group ID is and how assigning one to Kafka consumers during configuration helps with detecting new data, work sharing, and data recovery.

Lucia Cerchie

From Dumb Pipes to a Smart Data Plane: Introducing Schema IDs in Apache Kafka® Headers

Mar 10, 2026

Confluent’s Schema IDs in headers transform Kafka from "dumb pipes" to a "smart data plane." By moving metadata out of payloads, teams can schematize topics without breaking legacy apps or requiring big-bang migrations. This unlocks governed, AI-ready data for Flink and lakehouses with ease.

David Araujo

An Introduction to Apache Kafka Consumer Group Strategy

The Confluent Developer Newsletter

Course: Apache Kafka 101

작성자:

Issues with broker load

What is the function of a consumer group?

Exactly how do I configure my consumer group?

Where to go from here

The Confluent Developer Newsletter

Course: Apache Kafka 101

이 블로그 게시물이 마음에 드셨나요? 지금 공유해 주세요.

What Are Apache Kafka® Consumer Group IDs?

From Dumb Pipes to a Smart Data Plane: Introducing Schema IDs in Apache Kafka® Headers

Issues with broker load

What is the function of a consumer group?

Exactly how do I configure my consumer group?

Where to go from here

The Confluent Developer Newsletter

Course: Apache Kafka 101

이 블로그 게시물이 마음에 드셨나요? 지금 공유해 주세요.

Confluent 블로그 구독

What Are Apache Kafka® Consumer Group IDs?

From Dumb Pipes to a Smart Data Plane: Introducing Schema IDs in Apache Kafka® Headers