Do we need to think about underlying cluster while designing nifi templates?
Here is my simple flow
+-----------------+ +---------------+ +-----------------+
| | | | | |
| READ FROM | | MERGE | | PUT HDFS |
| KAFKA | | FILES | | |
| +-----------------------> | +---------------------> | |
| | | | | |
| | | | | |
| | | | | |
+-----------------+ +---------------+ +-----------------+
I have 3 nodes cluster.. When system is running I check "cluster" menu and see only master node is utilizing sources, other cluster nodes seems idle... The question is in such a cluster should I design template according to cluster or nifi should do the load balancing.
I saw one of my colleagues created remote processors for each node on cluster and put a load balancer in front of these within template, is it required? (like below)
+------------------+
| | +-------------+
| REMOTE PROCESS | | input port |
+----> | GROUP FOR | | (rpg) |
| | NODE 1 | +-------------+
| | | |
| | | |
| +------------------+ v
+-----------------+ +-----------------+ RPG
| | | | | +--------------+
| READ FROM | | | | | |
| KAFKA | | LOAD BALANCER | | +------------------+ | MERGE FILES |
| +-------------> | +-------------> | | | |
| | | | | | REMOTE PROCESS | | |
| | | | | | GROUP FOR | | |
| | | | | | NODE 2 | | |
+-----------------+ +-----------------+ RPG | | +--------------+
| +------------------+ |
| |
| v
|
| +-------------------+ +---------------+
| | | | |
| | REMOTE PROCESS | | PUT HDFS |
+-----> | GROUP FOR | | |
| NODE 3 | | |
| | | |
| | | |
+-------------------+ +---------------+
And what is the use-case for load-balancer except remote clusters, can I use load-balancer to split traffic into several processors to speedup the operation?