1
votes

I need to know if Apache NiFi supports running processors until completion.

"The execution of a series of processors in one process group should wait until another process group has finished executing."

For example:

Suppose there are three processors in NiFi UI.

    P1-->P2-->P3
    P-->Processor

I need P1 to run to completion before P2 starts, and P2 to complete before P3 starts, so that the processors run in sequence, each one waiting for the previous one to finish.

EDIT-1:

As an example, I have data at a web URL, which I can download using the GetHTTP processor. I then store that content with PutFile. Once the file has been saved in the PutFile directory, I want FetchFile to run and load that file into my database, as in the workflow below.

GetHTTP-->PutFile-->FetchFile-->DB

Is this possible?


1 Answer

12
votes

NiFi itself is not really a batch processing system; it is a data flow system geared more towards continuous processing. Having said that, there are some techniques you can use to do batch-like operations, depending on which processors you're using.

The Split processors (SplitText, SplitJson, etc.) write attributes to the flow files, including "fragment.identifier", which is the same for all splits created from one incoming flow file, and "fragment.count", which is the total number of those splits. Processors like MergeContent use those attributes to process a whole batch (aka fragment), so the output from those kinds of processors only appears after the entire batch/fragment has been processed.
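To make the idea concrete, here is a minimal Python sketch of that batching logic. It is not NiFi code and not how MergeContent is implemented; it only simulates the principle of holding splits until every member of a fragment has arrived. The function name `defragment` and the (attributes, content) tuples are made up for illustration, and "fragment.index" (the position of a split within its batch) is an attribute the Split processors also write, although it is not mentioned above.

```python
from collections import defaultdict


def defragment(flow_files):
    """Yield (fragment_id, merged_content) once all splits of a batch have arrived.

    flow_files is an iterable of (attributes, content) pairs; this shape is a
    hypothetical stand-in for NiFi flow files, not a NiFi API.
    """
    # fragment.identifier -> {fragment.index: content}
    pending = defaultdict(dict)
    for attrs, content in flow_files:
        frag_id = attrs["fragment.identifier"]
        expected = int(attrs["fragment.count"])
        pending[frag_id][int(attrs["fragment.index"])] = content
        # Release the batch only when every split has been seen.
        if len(pending[frag_id]) == expected:
            parts = pending.pop(frag_id)
            yield frag_id, "".join(parts[i] for i in sorted(parts))
```

Nothing is emitted for a fragment until its last split shows up, which is the "wait for the whole batch" behaviour described above.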

Another technique is to write an empty file in a temp directory when the job is complete, then a ListFile processor (pointing at that temp directory) would issue a flow file when the file is detected.
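As a rough sketch of the marker-file idea outside of NiFi, the upstream job could create an empty file when it finishes and the downstream step could poll for it, which is roughly the role ListFile would play in the flow. The function names, the ".done" suffix, and the timeout values below are assumptions chosen for illustration only.

```python
import time
from pathlib import Path


def mark_complete(marker_dir, job_name):
    """Create an empty marker file signalling that the job has finished."""
    marker = Path(marker_dir) / f"{job_name}.done"  # ".done" suffix is an arbitrary choice
    marker.touch()
    return marker


def wait_for_marker(marker_dir, job_name, timeout=5.0, poll_interval=0.1):
    """Poll for the marker file; return True if it appears before the timeout."""
    marker = Path(marker_dir) / f"{job_name}.done"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if marker.exists():
            return True
        time.sleep(poll_interval)
    # One last check in case the file appeared during the final sleep.
    return marker.exists()
```

In a real flow the polling side would be the ListFile processor pointed at the temp directory; the sketch only shows the handshake between "job done" and "next step may start".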

Can you describe more about the processors in your flow, and how you would know when a batch was complete?