I have some questions about Kafka Streams and how it works. I am experienced with the general Kafka consumer/producer paradigm, but this is my first time trying to use Kafka Streams.
Questions:
- In the plain Kafka consumer model, we subscribe to a topic and start consuming from a partition. For simplicity's sake, let's say we have 1 partition and 1 consumer. If we want to increase throughput, we increase the number of partitions and add more consumers to the group. How does this work in Kafka Streams? If we increase the partition count, how should we scale the app: do we need to add more machines, or is there some other mechanism?
- When I consume data via plain Kafka consumers, I may do some work with each message. For example, I may query an API, download a file, write it to an NFS mount, and forward the message; or write the incoming message value to a database and then forward a notification to another Kafka topic. How is the same use case solved when we are not following the paradigm of
KAFKA -> KAFKA
but instead have
KAFKA -> PROCESS (STORE IN DB) -> KAFKA
Can Kafka Streams even solve this use case?
- Lastly, how are exceptions handled, and how are offsets managed? In an ever-running production system with an endless stream of incoming messages, in case of any exception (say, a network outage), we shut down the consumers and do a clean bring-up. How do we achieve the same with a Kafka Streams processing app?
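To make the second question concrete, here is a minimal sketch of the topology I have in mind. The topic names (`input-topic`, `output-topic`) and the `saveToDb` callback are placeholders I made up for this question, standing in for whatever storage call we make per message:

```java
import java.util.function.Consumer;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class DbForwardTopology {
    // saveToDb is a placeholder for the per-message database write
    public static Topology build(Consumer<String> saveToDb) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("input-topic");
        source
            .peek((key, value) -> saveToDb.accept(value)) // side effect: store in DB
            .to("output-topic");                          // then forward downstream
        return builder.build();
    }
}
```

`peek` lets each record flow through unchanged after the side effect, which seems to match the PROCESS step above, but whether side effects inside a topology are acceptable practice is exactly what I am unsure about.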
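And for the last question, this is the kind of wiring I found while skimming the Streams docs; I am not sure it is the intended replacement for our shutdown-and-clean-bring-up routine. The topic names and bootstrap address are placeholders, and the choice of `SHUTDOWN_CLIENT` is my guess:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.StreamsUncaughtExceptionHandler;

public class StreamsShutdownSketch {
    public static KafkaStreams create() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);

        // On an unrecoverable exception (e.g. a network outage we cannot retry
        // through), shut the whole client down instead of leaving dead threads.
        streams.setUncaughtExceptionHandler(exception ->
                StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.SHUTDOWN_CLIENT);

        // Mirror our "clean bring-up" habit: close the app cleanly on SIGTERM.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        return streams;
    }
}
```

Is this the idiomatic way, and does `close()` here give the same clean-offset behavior we get from shutting down plain consumers?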