
I have a scenario where my Kafka messages (from the same topic) flow through a single enrichment pipeline and are written at the end to HDFS and MongoDB. My Kafka consumer for HDFS will run on an hourly basis (for micro-batching), so I need to know the best way to route FlowFiles to PutHDFS and PutMongo based on which consumer they came from (the consumer for HDFS or the consumer for MongoDB).

Or please suggest if there is any other way to achieve micro-batching through NiFi.

Thanks


1 Answer


You could set NiFi up to use a scheduling strategy (e.g., CRON-driven) for the processors that upload data.

I would think you want the Kafka consumers to read data continuously, building a backlog of FlowFiles in NiFi, and then have the put processors run on a less frequent basis.
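For example, here is a minimal sketch (not tested against a live cluster) of switching a PutHDFS processor to CRON-driven scheduling through NiFi's REST API; the NiFi URL and processor ID are placeholders for your own flow:

```python
import requests

NIFI_URL = "http://localhost:8080/nifi-api"   # assumption: default unsecured NiFi
PROCESSOR_ID = "your-puthdfs-processor-id"    # placeholder: copy it from the processor's settings

# Fetch the current entity; NiFi requires the revision for optimistic locking.
proc = requests.get(f"{NIFI_URL}/processors/{PROCESSOR_ID}").json()

payload = {
    "revision": proc["revision"],
    "component": {
        "id": PROCESSOR_ID,
        "config": {
            # Run at the top of every hour instead of continuously, so the
            # backlog of FlowFiles is written to HDFS as one micro-batch.
            "schedulingStrategy": "CRON_DRIVEN",
            "schedulingPeriod": "0 0 * * * ?",  # Quartz cron: sec min hour dom month dow
        },
    },
}

requests.put(f"{NIFI_URL}/processors/{PROCESSOR_ID}", json=payload).raise_for_status()
```

You can do the same thing by hand in the UI under the processor's Scheduling tab; the PutMongo processor could stay timer-driven so Mongo writes remain near-real-time.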


This is similar to how Kafka Connect batches data for its HDFS Connector.
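For comparison, a rough sketch of that connector's batching knobs, assuming Confluent's HDFS Sink Connector and a Connect worker at localhost:8083 (the topic, NameNode address, and flush/rotate values are illustrative only):

```python
import requests

connector = {
    "name": "hdfs-sink",
    "config": {
        "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
        "topics": "your-topic",               # placeholder topic name
        "hdfs.url": "hdfs://namenode:8020",   # placeholder NameNode address
        # Batch by size and/or time: commit a file every 10000 records
        # or every hour, whichever comes first.
        "flush.size": "10000",
        "rotate.interval.ms": "3600000",
    },
}

requests.post("http://localhost:8083/connectors", json=connector).raise_for_status()
```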