I am looking for a way, if one exists, to speed up bulk indexing.
Currently with my cluster setup, which includes 8 storage optimized data nodes and 2 memory optimized master nodes, it takes approximately 20 hours for the data to be indexed.
The data is relatively large: about 1 TB once stored in shards.
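For a rough sense of scale, here is a back-of-the-envelope calculation in Python. It assumes the ~1 TB on-disk size is a fair proxy for the volume being indexed, which may not hold since the raw input could be larger or smaller. The function name and constants are only illustrative.

```python
def average_throughput_mb_per_s(total_bytes: float, hours: float) -> float:
    """Average rate in MB/s (1 MB = 10**6 bytes) over the whole run."""
    return total_bytes / (hours * 3600) / 1e6

# Assumption: on-disk size stands in for the amount of data indexed.
STORED_BYTES = 1e12      # ~1 TB, from the figures above
INDEXING_HOURS = 20      # approximate wall-clock time of the run
DATA_NODES = 8

cluster_rate = average_throughput_mb_per_s(STORED_BYTES, INDEXING_HOURS)
per_node_rate = cluster_rate / DATA_NODES

print(f"cluster: {cluster_rate:.1f} MB/s, per data node: {per_node_rate:.2f} MB/s")
# prints: cluster: 13.9 MB/s, per data node: 1.74 MB/s
```

Under that assumption, the cluster as a whole averages roughly 14 MB/s, which is the number I would like to improve.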
All nodes run on AWS EC2 instances. Only the master nodes are attached to a load balancer (ALB), and every request to Elasticsearch goes through it, so each bulk indexing request travels from the ALB to one of the master nodes and then on to the data nodes.
The following is set before bulk indexing starts:
[Cluster]
- 8 storage optimized dedicated data nodes
- 2 memory optimized dedicated master nodes
[Index]
- number_of_shards: 6
- number_of_replicas: 0
- refresh_interval: -1
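To make the setup concrete, the sketch below builds the index-creation body for the settings listed above and checks how many data nodes can hold a shard at all. The helper names are hypothetical, and the body is meant to be sent as the JSON payload of a create-index request (`PUT /<index>`); nothing here talks to a cluster.

```python
import json

def create_index_body(shards: int, replicas: int, refresh_interval: str) -> dict:
    """Build the JSON body for creating an index with bulk-load settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                "refresh_interval": refresh_interval,  # "-1" disables refresh
            }
        }
    }

def max_nodes_holding_shards(shards: int, replicas: int, data_nodes: int) -> int:
    """Upper bound on data nodes that can hold at least one shard copy."""
    total_copies = shards * (1 + replicas)
    return min(total_copies, data_nodes)

body = create_index_body(shards=6, replicas=0, refresh_interval="-1")
print(json.dumps(body, indent=2))

busy_nodes = max_nodes_holding_shards(shards=6, replicas=0, data_nodes=8)
print(f"data nodes that can hold a shard: {busy_nodes} of 8")
# prints: data nodes that can hold a shard: 6 of 8
```

With 6 primary shards and no replicas, at most 6 of the 8 data nodes hold a shard of this index, so I am unsure whether the remaining two contribute to indexing at all.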
Is there any way to improve the indexing performance of a cluster with these settings?