I set up an ELK stack with 3 nodes: one master node and 2 data nodes. Assume I have about 1 GB of data to work with in the cluster. I need to know:

  • how many shards each node should contain

  • how much RAM and CPU should be allocated to each node

  • how to allocate the maximum storage for a node

I built the ELK stack on Ubuntu.

system 1 properties

  • 12GB RAM

  • 500 GB HDD

system 2 properties

  • 8GB RAM

  • 500 GB HDD

system 3 properties

  • 4GB RAM

  • 500 GB HDD

I set the number of shards to 9 since there are 3 nodes (3*3=9), using the REST API:

curl -X POST "http://localhost:9200/_template/default" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["*"],
  "order": -1,
  "settings": {
    "number_of_shards": "9",
    "number_of_replicas": "1"
  }
}
'

I don't know whether this is right or wrong.

I need to build a healthy cluster.

Is there any method or set of parameters for assigning shards, replicas, RAM, disk space, etc.?

Is there any method to find the ideal number of shards depending on the file size?

How many CPU cores must be allocated to each node?

I referred to the following links to build the ELK cluster so far.

Short answer: it depends. Shard allocation is discussed here: elastic.co/blog/… - Adam T
@ADARSHK It really depends on what you want to do with Elasticsearch and your data; the only way to know is to test. You should start small, though. For example, 9 shards with 1 replica gives you 18 shards in total, and there is no need to start that way if you only have 2 data nodes. You should also try to give your data nodes the same CPU, memory, and disk configuration. Start small, test with your data, and grow as needed. - leandrojmp
@ADARSHK If you only need to work with 1 GB of data, you don't even need a multi-node cluster. A small single-node cluster is more than enough: one machine with 2 cores and 4 GB of RAM, using 2 GB for the Java heap. You can also use only one shard, and you don't need replicas, since you will be working with only one node. - leandrojmp
@JerrinThomas The only way is to test and grow when or if needed. It really depends on your data; there is no one-size-fits-all with Elasticsearch. The link in the first comment has good tips about sharding that you should try to follow. - leandrojmp

1 Answer

Shards details

Generally, we recommend that if you don’t expect data to grow significantly, then:

  • One primary shard is fine if you have less than 100K documents

  • One primary shard per node is good if you have over 100K documents

  • One primary shard per CPU core is good if you have at least a couple million documents
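
The three guidelines above can be sketched as a small shell function. This is only a rough illustration: `suggest_primary_shards` is a hypothetical helper name, "a couple million" is read as 2,000,000, and "per CPU core" is read as cores summed across the data nodes. Both readings are assumptions, so treat the result as a starting point for testing, not a final value.

```shell
#!/usr/bin/env bash
# Rule-of-thumb primary shard count, following the three guidelines quoted above.
# suggest_primary_shards is a hypothetical helper, not an Elasticsearch tool.
# Arguments: document count, number of data nodes, CPU cores per data node.
suggest_primary_shards() {
  local docs=$1 data_nodes=$2 cores_per_node=$3

  if (( docs < 100000 )); then
    # Fewer than 100K documents: a single primary shard.
    echo 1
  elif (( docs < 2000000 )); then
    # Over 100K documents: one primary shard per data node.
    echo "$data_nodes"
  else
    # Assumption: "a couple million" means 2,000,000 or more.
    # One primary shard per CPU core, summed over the data nodes.
    echo $(( data_nodes * cores_per_node ))
  fi
}

# Example: 50,000 documents on 2 data nodes with 2 cores each.
suggest_primary_shards 50000 2 2
```

For the 2 data nodes in the question, anything under 100K documents comes out as a single primary shard, which matches the "start small" advice in the comments.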

An index can have many shards, but any given shard can only belong to one index.
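
As a hedged sketch of how a smaller shard count could be applied and checked, the call below reuses the template request from the question with `number_of_shards` lowered to 1 (an illustrative value, not a tested recommendation), then lists the shards with the `_cat/shards` API. It assumes Elasticsearch is reachable at `localhost:9200`, as in the question.

```shell
# Same template call as in the question, with a smaller shard count.
# The value 1 is an illustrative starting point, not a measured optimum.
curl -X POST "http://localhost:9200/_template/default" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["*"],
  "order": -1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
'

# List every shard with the index it belongs to, whether it is a
# primary (p) or replica (r), and the node it is allocated to.
curl -X GET "http://localhost:9200/_cat/shards?v"
```

A template only affects indices created after it is stored, so existing indices keep their current shard count.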

Ref: https://docs.bonsai.io/article/122-shard-primer