Is there an elegant way to query a Kafka topic for a specific record? The REST API that I'm building receives an ID and needs to look up the records associated with that ID in a Kafka topic. One approach is to check every record in the topic via a custom consumer and look for a match, but I'd like to avoid the overhead of reading a bunch of records. Does Kafka have a fast, built-in filtering capability?
19
votes
As Chris points out, Kafka provides some mechanism for retrieving individual records. But I would caution that this is not what Kafka is primarily designed for; see kafka.apache.org/documentation.html#uses. If you're using Kafka primarily to retrieve individual messages from a topic, a different tool might fit your use case better.
– Morgan Kenyon
Yeah, I've decided to implement a Kafka consumer that writes to a Mongo database, and then my REST API can request the individual records from there. I upvoted Chris's answer because it confirmed my suspicion that it wasn't possible (at least elegantly).
– user554481
2 Answers
21
votes
The only fast way to search for a record in Kafka (to oversimplify) is by partition and offset. The new producer class can return, via futures, the partition and offset into which a message was written. You can use these two values to very quickly retrieve the message.
So if you make the ID out of the partition and offset, then you can implement your fast query. Otherwise, not so much. This means that the ID for an object isn't part of your data model, but rather is generated by the Kafka-aware code that writes the record.
Maybe that works for you, maybe it doesn't.
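To make the partition-and-offset idea concrete, here is a minimal sketch, not a definitive implementation. The names `RecordId`, `encode_id`, `decode_id`, and `fetch_by_id`, the `"partition-offset"` ID format, and the `localhost:9092` broker address are all hypothetical, chosen for illustration. The lookup assumes the third-party kafka-python client; the encode/decode helpers use only the standard library.

```python
from typing import NamedTuple, Optional


class RecordId(NamedTuple):
    """Hypothetical ID type: where a record lives inside a topic."""
    partition: int
    offset: int


def encode_id(partition: int, offset: int) -> str:
    """Build an opaque ID such as '3-42' from a partition and an offset."""
    if partition < 0 or offset < 0:
        raise ValueError("partition and offset must be non-negative")
    return f"{partition}-{offset}"


def decode_id(record_id: str) -> RecordId:
    """Parse an ID produced by encode_id back into partition and offset."""
    partition_text, sep, offset_text = record_id.partition("-")
    if not sep or not partition_text.isdigit() or not offset_text.isdigit():
        raise ValueError(f"malformed record ID: {record_id!r}")
    return RecordId(int(partition_text), int(offset_text))


def fetch_by_id(topic: str, record_id: str,
                bootstrap_servers: str = "localhost:9092",
                timeout_ms: int = 5000) -> Optional[bytes]:
    """Seek straight to the record instead of scanning the topic.

    Assumes the kafka-python package is installed; it is imported here
    so the helpers above work without it.
    """
    from kafka import KafkaConsumer, TopicPartition

    rid = decode_id(record_id)
    tp = TopicPartition(topic, rid.partition)
    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers,
                             enable_auto_commit=False)
    try:
        consumer.assign([tp])          # manual assignment, no consumer group
        consumer.seek(tp, rid.offset)  # jump directly to the stored offset
        batches = consumer.poll(timeout_ms=timeout_ms, max_records=1)
        for record in batches.get(tp, []):
            # Guard against the position having been reset, for example
            # because the record was already removed by retention.
            if record.offset == rid.offset:
                return record.value
        return None
    finally:
        consumer.close()
```

On the write side, with the same client, the ID would come from the producer's future: `metadata = producer.send(topic, value).get(timeout=10)` followed by `encode_id(metadata.partition, metadata.offset)`. Creating a consumer per request is only for brevity here; a real service would probably reuse one.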