5
votes

I am looking to set up an ELK stack and have three servers to do so. While I have found plenty of documentation and tutorials about how to actually install, and configure elasticsearch, logstash, and kibana, I have found less information about how I should set up the software across my servers to maximize performance. For example, would it be better to set up elasticsearch, logstash, and kibana on all three instances, or perhaps install elasticsearch on two instances and logstash and kibana on the third?

Related to that question, if i have multiple elasticsearch servers in my cluster, will I need a load balancer to spread requests to them, or can I send the data to one server, and it will distribute it accordingly?

1

1 Answers

10
votes

The size of your machines would also be important. Three machines with 8GB of RAM is much different than three with 64GB or more...

Kibana takes very few resources. Logstash is more CPU-heavy. Elasticsearch is more RAM heavy.

With an elasticsearch cluster, you usually want a replica of each shard for redundancy. That's usually done with two servers. If you have a third elasticsearch server, then you'll get an IO boost (writing two copies of the data to three servers lowers the load). Also, an even number of servers can get confused as to which is the master, so three will help prevent "split brain" problems.

Those two or three nodes would be "data" nodes, so if you throw queries or indexing requests at them, they may need to move the request to a different server (the one with the data, etc). A request also has a "reduce" phase, where the data from each node is combined before being returned. Having a smaller "client" node - where queries and index requests go - helps with that. Of course, you'd want two, to make them redundant.

Logstash is best run multithreaded, so having multiple cpus that you can dedicate is nice. Having a redundant/load-balanced logstash machine is also nice. Kibana could run on these machines as well.

So, we're quickly up to 7 machines. Not what you wanted to hear, right?

If you're firmly limited to 3 machines, you'd want to run elasticsearch on all three as mentioned above. You need to shoehorn in the rest.

Logstash on two, kibana on one? Then you have a single point of failure for kibana.

How about logstash on all three and kibana on all three? The load would be distributed around, so hopefully would be a small increment for each server. And, if the machines are beefy enough, it should be OK.

I have machines in one cluster that run logstash,

The general recommendation is to allocate 1/2 the system RAM (up to ~31GB) to elasticsearch, leaving the rest to the operating system. If you were going to run logstash and kibana on the same machines, you'd want to lower that (to maybe 40%?), give logstash some (15%?) and leave the rest to the OS.

Clearly, the size of your machines is important here.