I'm looking for an easy way to stream messages from a topic on one Kafka cluster to another cluster (remote -> local).
The idea is to use Kafka Streams directly, so that we don't need to replicate the actual messages on the local cluster and only write the "results" of the Kafka Streams processing to our local Kafka topics.
Let's say the WordCount demo is set up on a Kafka instance running on a different PC from my own, and I also have a Kafka instance running on my local machine.
Now I want to run the WordCount demo against the "remote" topic containing the sentences whose words should be counted.
The counts, however, should be written to a topic on my local system instead of a "remote" topic.
Is something like this doable with the Kafka Streams API?
For example:
// Wished-for API (pseudocode): one config per cluster
val builder: KStreamBuilder = new KStreamBuilder(remoteStreamConfig, localStreamConfig)
val textLines: KStream[String, String] = builder.stream("remote-input-topic", remoteStreamConfig)
val wordCounts: KTable[String, Long] = textLines
  .flatMapValues(textLine => textLine.toLowerCase.split("\\W+").toIterable.asJava)
  .groupBy((_, word) => word)
  .count("word-counts")
wordCounts.to(stringSerde, longSerde, "local-output-topic", localStreamConfig)
val streams: KafkaStreams = new KafkaStreams(builder)
streams.start()
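To make the setup concrete, the two configurations passed in the snippet above would be something like the following. This is only a sketch: the host names, the port and the application id are placeholders I made up, and the object name ClusterConfigs is just for illustration. The keys "application.id" and "bootstrap.servers" are the standard Kafka Streams settings.

```scala
import java.util.Properties

object ClusterConfigs {
  // Placeholder settings for the remote cluster that holds the input topic.
  val remoteStreamConfig: Properties = {
    val props = new Properties()
    props.setProperty("application.id", "wordcount-remote-to-local") // made-up id
    props.setProperty("bootstrap.servers", "remote-host:9092")       // made-up address
    props
  }

  // Placeholder settings for the local cluster that should receive the counts.
  val localStreamConfig: Properties = {
    val props = new Properties()
    props.setProperty("application.id", "wordcount-remote-to-local") // made-up id
    props.setProperty("bootstrap.servers", "localhost:9092")         // made-up address
    props
  }
}
```

The only difference between the two is the "bootstrap.servers" value, which is what decides which cluster a client talks to.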
Thank you very much
- Tim