I am using Apache NiFi for ETL in one of my clickstream projects.
I am currently receiving around 300 messages per second with the following infrastructure:
- RAM - 16 GB
- Swap - 6 GB
- CPU - 16 cores
- Disk - 100 GB (persistence not required)
- Cluster - 6 nodes
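For context, here is a rough back-of-the-envelope calculation of the load per node and per core, using the figures above. It is only a sketch: it assumes the specs listed are per node and that traffic is spread evenly across the cluster, and the function names are just for illustration.

```python
# Rough load estimate from the figures in this post.
TOTAL_MSGS_PER_SEC = 300   # current incoming traffic
NODES = 6                  # cluster size
CORES_PER_NODE = 16        # assumption: the specs above are per node


def per_node_load(total_rate, nodes):
    """Messages per second each node handles, assuming an even spread."""
    return total_rate / nodes


def per_core_load(total_rate, nodes, cores_per_node):
    """Messages per second per core, assuming an even spread."""
    return total_rate / (nodes * cores_per_node)


if __name__ == "__main__":
    print("msgs/s per node:", per_node_load(TOTAL_MSGS_PER_SEC, NODES))
    print("msgs/s per core:",
          per_core_load(TOTAL_MSGS_PER_SEC, NODES, CORES_PER_NODE))
```

Under those assumptions each node sees about 50 messages per second, which is why the slowdown is surprising to me.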
The entire cluster UI has become extremely slow, with the following issues:
- Processors apply back pressure when a failure happens, which consumes a lot of threads
- Provenance writing becomes very slow
- Cluster heartbeats across nodes become slow
I have the following questions about the setup:
- Is using an RPG recommended, given that it is an HTTP call? I am using it to spread the load across all the nodes, because there is an existing issue with the EMQTT process for consumer groups.
- What is the recommended value of thread count that should be allotted per core?
- What are the guidelines for infrastructure sizing?
- What are the tuning parameters for a large cluster with a high rate of incoming requests and a lot of heavy JSON parsing for transformation?