You're asking very general questions about how Kafka was designed. For this, there's no better place to look than the official Kafka documentation:
https://kafka.apache.org/documentation.html#impl_zookeeper
The official controller explanation, from the docs (my highlights):
It is also important to optimize the leadership election process as
that is the critical window of unavailability. A naive implementation
of leader election would end up running an election per partition for
all partitions a node hosted when that node failed. Instead, we elect
one of the brokers as the "controller". This controller detects
failures at the broker level and is responsible for changing the
leader of all affected partitions in a failed broker. The result is
that we are able to batch together many of the required leadership
change notifications which makes the election process far cheaper and
faster for a large number of partitions. If the controller fails, one
of the surviving brokers will become the new controller.
https://kafka.apache.org/documentation.html#design_replicamanagment
So, the controller is the broker responsible for detecting broker failures and managing partition leadership. When a broker fails and the partitions it was leading become unavailable, the controller elects a new leader for each affected partition, so that clients (both producers and consumers) can keep producing to and consuming from them.
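To make the batching idea from the quote concrete, here is a toy model in Python. It is only a sketch, not Kafka's actual code: the names `PartitionState` and `reelect_leaders`, the sample topics, and the rule "pick the first surviving in-sync replica" are all assumptions made for illustration. The point is that one pass by a single controller handles every partition the failed broker was leading.

```python
from typing import Dict, List, NamedTuple


class PartitionState(NamedTuple):
    """Hypothetical, simplified view of one partition's replication state."""
    leader: int
    isr: List[int]  # in-sync replicas, leader included


def reelect_leaders(partitions: Dict[str, PartitionState],
                    failed_broker: int) -> Dict[str, int]:
    """Return the new leader for every partition that the failed broker led.

    Toy convention: -1 means no in-sync replica survived, so the
    partition stays without a leader.
    """
    changes: Dict[str, int] = {}
    for name, state in partitions.items():
        if state.leader != failed_broker:
            continue  # partitions led by other brokers are unaffected
        survivors = [b for b in state.isr if b != failed_broker]
        changes[name] = survivors[0] if survivors else -1
    return changes


# Made-up cluster state: broker 100 leads two of the three partitions.
partitions = {
    "orders-0": PartitionState(leader=100, isr=[100, 101, 102]),
    "orders-1": PartitionState(leader=101, isr=[101, 102]),
    "clicks-0": PartitionState(leader=100, isr=[100]),
}

# One call covers all partitions affected by the failure of broker 100.
print(reelect_leaders(partitions, failed_broker=100))
```

Without a controller, each affected partition would need its own election; here the decision for all of them is made in one place and can be sent out as a single batch of notifications.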
You can't see anything under /controller because you're treating it as a "directory", when in reality it's a znode that holds data. Issue a get /controller command to see its contents. You should see something like this:
[zk: s1:2181(CONNECTED) 1] get /controller
{"version":1,"brokerid":100,"timestamp":"1506197069724"}
cZxid = 0xf9
ctime = Sat Sep 23 22:04:29 CEST 2017
mZxid = 0xf9
mtime = Sat Sep 23 22:04:29 CEST 2017
pZxid = 0xf9
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15eaa3a4fdd000d
dataLength = 56
numChildren = 0
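The first line of that output is the znode's data: a small JSON document whose brokerid field tells you which broker is currently the controller. If you want to read it from a script, a minimal Python sketch could look like this. The names `ControllerInfo` and `parse_controller_znode` are mine, not part of any Kafka or ZooKeeper API, and I'm assuming the timestamp field is milliseconds since the Unix epoch, which matches the ctime shown above.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class ControllerInfo(NamedTuple):
    """Hypothetical container for the fields stored in /controller."""
    version: int
    broker_id: int
    timestamp: datetime


def parse_controller_znode(data: bytes) -> ControllerInfo:
    """Decode the JSON payload stored in the /controller znode."""
    payload = json.loads(data.decode("utf-8"))
    # The timestamp is stored as a string; assumed to be epoch milliseconds.
    millis = int(payload["timestamp"])
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return ControllerInfo(
        version=payload["version"],
        broker_id=payload["brokerid"],
        timestamp=epoch + timedelta(milliseconds=millis),
    )


# The exact payload from the zkCli output above.
RAW = b'{"version":1,"brokerid":100,"timestamp":"1506197069724"}'

info = parse_controller_znode(RAW)
print(info.broker_id)              # id of the broker acting as controller
print(info.timestamp.isoformat())  # UTC time decoded from the payload
```

The decoded timestamp comes out as 20:04:29 UTC, which is the same instant as the 22:04:29 CEST ctime in the output, and the payload is 56 bytes long, matching dataLength. Also note the non-zero ephemeralOwner: in ZooKeeper that marks an ephemeral znode tied to the session that created it, so it disappears when that session ends.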