6
votes

I have read the thread "bootstrap-server vs zookeeper in consumer console", but it does not clear up my doubt.

Let's say we have ZooKeeper running at localhost:2181 and three brokers running at localhost:9092, localhost:9093, and localhost:9094. We have one topic, my_topic, with 3 partitions and a replication factor of 1, so the topic's partitions are spread across the brokers.

In newer versions of Apache Kafka, when we run the console consumer we need to pass --bootstrap-server localhost:9092, which is the address of one of the brokers, whereas in earlier versions we passed the ZooKeeper address.

So when we run a consumer to read messages from my_topic, we pass --bootstrap-server localhost:9092, which is just the address of one broker. My question is: are we restricting the consumer to consume messages only from that broker? And if so, how will the consumer read messages from the topic if that broker goes down? I don't understand how this works. Could someone please clarify?

Older version command to run the consumer (< 1.0):
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my_topic
Newer version command to run the consumer (>= 1.0):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my_topic

This "new version" with bootstrap-server has been around for at least 3 years now, so if you've found documentation that uses --zookeeper for the CLI commands, that documentation is itself relatively old. - OneCricketeer
Also, running three brokers on a single machine does not offer any performance improvement or fault tolerance. You still have maybe one disk and one host, and all processes share the same network, memory space (but separate JVMs), etc. You can still have 3 partitions with a single broker, just not 3 replicas. - OneCricketeer
Additionally, when you have multiple brokers, it is better to pass all of them to the bootstrap-server flag as a comma-separated list (the same goes for application code); if one broker is unavailable, the next in the list will be tried. As @cricket_007 mentioned, in your current case it may not help, since all brokers are on the same machine; typically they are spread out across different IPs / hostnames. - AbhishekN
@AbhishekN Even with ZooKeeper, you can comma-delimit multiple of them. That wasn't my point, though. - OneCricketeer
The "old" version happens to be what Apache has in its documentation tutorials for the Kafka quickstart: kafka.apache.org/21/documentation/streams/quickstart - beauXjames

1 Answer

4
votes

In earlier Kafka versions (before 0.9.0), the consumer needed a connection to ZooKeeper both for committing offsets and for getting topic metadata. Starting from 0.9.0, consumer offsets are saved in a Kafka topic (__consumer_offsets), and the consumer no longer needs a connection to ZooKeeper.

What you specify in the --bootstrap-server parameter is exactly what the name says: a list of bootstrap servers. The consumer connects to the brokers you specify and asks them for metadata about the topics it wants to consume. It is not limited to consuming messages only from the brokers listed in the --bootstrap-server parameter.

Let's say you specify "kafka1:9092" as the bootstrap server (in a cluster with 3 brokers, as you said). After connecting, the consumer sends a metadata request to get information about "my_topic". The reply tells the consumer which broker leads each partition; for example, "kafka1" could answer, in effect, "I am not the leader for partition 0 of my_topic; the leader for it is kafka2". At this point, the consumer connects to the "kafka2" broker and starts fetching messages from it.
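
To make the discovery step above concrete, here is a minimal toy model in Python. It is not Kafka client code: `ToyCluster` and `discover_leaders` are hypothetical names invented for this sketch, `localhost:9999` is a made-up unreachable address, and trying the bootstrap addresses strictly left to right is a simplification. It only illustrates the idea that any reachable bootstrap broker can return the partition leaders for the whole cluster, after which the consumer talks to those leaders directly.

```python
class ToyCluster:
    """Toy stand-in for a Kafka cluster (illustration only, not a real client)."""

    def __init__(self, leaders):
        # Map of partition number -> address of the broker leading it.
        self.leaders = dict(leaders)
        # In this toy, every broker that leads a partition is reachable.
        self.live = set(self.leaders.values())

    def metadata(self, address):
        """Any reachable broker answers with the leaders of all partitions."""
        if address not in self.live:
            raise ConnectionError(f"cannot reach {address}")
        return dict(self.leaders)


def discover_leaders(cluster, bootstrap_servers):
    """Try each bootstrap address until one returns cluster metadata."""
    for address in bootstrap_servers.split(","):
        try:
            return cluster.metadata(address.strip())
        except ConnectionError:
            continue  # this bootstrap address is unreachable, try the next one
    raise ConnectionError("no bootstrap server reachable")


# Three brokers, each leading one partition of my_topic (as in the question).
cluster = ToyCluster({
    0: "localhost:9092",
    1: "localhost:9093",
    2: "localhost:9094",
})

# Only one broker is given as bootstrap server, yet all leaders are discovered.
leaders = discover_leaders(cluster, "localhost:9092")
for partition, leader in sorted(leaders.items()):
    print(f"my_topic-{partition}: fetch from {leader}")
```

In this model, calling `discover_leaders(cluster, "localhost:9999,localhost:9093")` still succeeds because the second address answers, which is why listing several brokers in --bootstrap-server is commonly recommended.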