1
votes

The flow files are stuck in a queue (load-balanced by attribute) and are not picked up by the downstream processor (MergeRecord with CSVReader and CSVRecordSetWriter). The NiFi UI shows flow files in the queue, but listing the queue returns "Queue has no flow files". Attempting to empty the queue gives the same message. The NiFi logs contain no exceptions related to the processor. There are around 80 flow files in the queue.

I have tried the following, all in vain:

  • Restarting the downstream processor and the upstream processor (ConvertRecord).
  • Disabling and re-enabling CSVReader and CSVRecordSetWriter.
  • Disabling load balancing.
  • Setting flow file expiration to 3 sec.

Screenshots (images not preserved): flow file queue, MergeRecord properties, CSVReader service, CSVRecordSetWriter service.

3
screenshot would be helpful. - daggett
@daggett added screenshot. Let me know if you need anything else. - Suman
also please show the parameters of MergeRecord - daggett
Added all details. - Suman
You likely need to upgrade to 1.9.0 to resolve some issues with load-balanced connections. - Bryan Bende

3 Answers

0
votes

This is probably because the content of the flow file was deleted while its entry is still present in the FlowFile repository.

If you have a dockerized NiFi setup and no heavy production flow, you can stop NiFi and delete everything in the *_repository folders (flowfile_repository, content_repository, etc.), provided all your directories are mounted and no other data is at risk of being lost. Note that this discards every flow file currently queued.

Let me know if you need further assistance.

0
votes

Your MergeRecord processor is running only on the primary node, and most likely all the flow files are on other nodes (since you are load balancing). NiFi does not take into account that the downstream processor runs only on the primary node, so it does not automatically rebalance the flow files to that node. Simply changing MergeRecord to run on all nodes will allow the flow files to pass through.

Alas, I have not found a way to get all flow files back onto the primary node. You can use the "Single Node" load balance strategy to put all the flow files on the same node, but that node will not necessarily be the primary.
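To make the explanation above concrete, here is a toy Python model of it, not NiFi code. The node names, the choice of primary node, and the hash function are all hypothetical; the sketch only assumes that a partition-by-attribute strategy maps each attribute value to one node, and that a processor consumes only the queue partitions on nodes where it is scheduled.

```python
import hashlib

NODES = ["node-1", "node-2", "node-3"]  # hypothetical cluster members
PRIMARY = "node-1"                      # hypothetical primary node


def node_for(attribute_value, nodes=NODES):
    """Map an attribute value to a node with a stable hash (toy stand-in)."""
    digest = hashlib.sha256(attribute_value.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def distribute(values, nodes=NODES):
    """Build one queue partition per node from the attribute values."""
    queues = {node: [] for node in nodes}
    for value in values:
        queues[node_for(value, nodes)].append(value)
    return queues


def drain(queues, running_on):
    """Return what is left after the processor runs on the given nodes only."""
    return {
        node: ([] if node in running_on else list(items))
        for node, items in queues.items()
    }


values = [f"customer-{i}" for i in range(80)]  # 80 flow files, as in the question
queues = distribute(values)

stuck = drain(queues, running_on={PRIMARY})      # "Primary node" execution
cleared = drain(queues, running_on=set(NODES))   # "All nodes" execution

print("left per node (primary only):", {n: len(q) for n, q in stuck.items()})
print("left in total (all nodes):", sum(len(q) for q in cleared.values()))
```

In this model, whatever was routed to the non-primary nodes stays queued when the processor runs on the primary only, while running on all nodes empties every partition.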

0
votes

You have a misconfiguration in the way you load balance your flow files. To confirm it, stop your MergeRecord processor so that you can list the queue and see what is inside it.

In the modal window that opens, you can see on which node each flow file is waiting. It is highly probable that your flow files are in fact on one or more of the other nodes; since MergeRecord runs only on the primary node, it finds nothing in its local queue.
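The same per-node check can be sketched in Python against a node-wise connection status payload. This is only an illustration: the `sample` dictionary is made up, and its field names (`nodeSnapshots`, `address`, `apiPort`, `statusSnapshot`, `flowFilesQueued`) are an assumption about the shape of the cluster status response, so verify them against the REST API documentation for your NiFi version before relying on them.

```python
def queued_per_node(connection_status):
    """Return {"host:port": queued flow file count} for each node snapshot."""
    counts = {}
    for snapshot in connection_status.get("nodeSnapshots", []):
        node = f'{snapshot["address"]}:{snapshot["apiPort"]}'
        counts[node] = snapshot["statusSnapshot"]["flowFilesQueued"]
    return counts


def non_primary_holders(connection_status, primary):
    """List the non-primary nodes that still hold queued flow files."""
    counts = queued_per_node(connection_status)
    return sorted(node for node, n in counts.items() if n > 0 and node != primary)


# Made-up payload in the assumed shape; hosts and counts are hypothetical.
sample = {
    "nodeSnapshots": [
        {"address": "nifi-1", "apiPort": 8080,
         "statusSnapshot": {"flowFilesQueued": 0}},
        {"address": "nifi-2", "apiPort": 8080,
         "statusSnapshot": {"flowFilesQueued": 47}},
        {"address": "nifi-3", "apiPort": 8080,
         "statusSnapshot": {"flowFilesQueued": 33}},
    ]
}

print(queued_per_node(sample))
print(non_primary_holders(sample, primary="nifi-1:8080"))
```

If the primary node reports zero queued flow files while other nodes report the rest, that matches the symptom described here: the UI total is non-zero, but the node running MergeRecord has nothing to process.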