We are receiving JSON messages from an upstream system via a Kafka topic. The requirement is to store these messages in HDFS at a certain interval. Since we are storing them in HDFS, we want to merge a certain number of these records into a single file. As per the NiFi documentation, we are using the "MergeRecord" processor for that.
About the incoming records:
- They are multi-line JSON messages with a nested structure.
- They are all based on the same schema (they are picked up from a single Kafka topic).
- They are validated messages, and the NiFi processor is able to parse them, so apparently there are no issues with the JSON messages from a schema point of view.
Present Configuration
Below is a snapshot of the processor configuration. NiFi version: 1.8
Expected behavior
For the above configuration, it is expected that MergeRecord would wait for one of the thresholds to be reached, i.e. Maximum Number of Records (100000) or Maximum Bin Size (100 KB).
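To make the expectation concrete, here is a minimal sketch of the rule we assumed the processor would follow. It only models our expectation and is not NiFi's actual implementation; the function and constant names are hypothetical, and 100 KB is assumed to mean 100 * 1024 bytes.

```python
# Hypothetical model of the EXPECTED behavior only; not NiFi's implementation.
MAX_RECORDS = 100_000              # Maximum Number of Records from the configuration
MAX_BIN_SIZE_BYTES = 100 * 1024    # Maximum Bin Size of 100 KB (assumed 1 KB = 1024 bytes)


def expected_bin_is_full(record_count: int, bin_size_bytes: int) -> bool:
    """Return True once either configured maximum has been reached."""
    return record_count >= MAX_RECORDS or bin_size_bytes >= MAX_BIN_SIZE_BYTES


# The case we observe: 2 records, assuming up to 5 KB each (10 KB in total).
observed_should_merge = expected_bin_is_full(record_count=2, bin_size_bytes=2 * 5 * 1024)
```

Under this model, a bin holding 2 records of 5 KB would not be considered full, which is why the observed behavior is surprising to us.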
Observed Behavior
However, we observe that the bin is getting bundled well before either threshold is reached. The processor forms a bin from only 2 records of 5 KB in size.
Could you help with analysis and/or pointers as to why the MergeRecord processor is not behaving as per the configuration?
