If you have 10 datanodes in an existing Hadoop cluster, could you install NiFi on 4 or 6 of them?
The main purpose of NiFi would be loading high volumes of data daily from an RDBMS into HDFS.
The datanodes would be configured with plenty of RAM, let's say 100 GB each. An external 3-node ZooKeeper cluster would be used.
- Are there any major concerns with this approach?
- Does it make more sense to just install NiFi on EVERY datanode, i.e. all 10?
- Are there any issues with having a large cluster of 10 NiFi nodes?
- Will any NiFi configuration best practices conflict with the Hadoop configuration?
Edit: Currently using Hortonworks version 2.6.5 and open source NiFi 1.9.2