34 votes

I am currently using Elasticsearch 0.9.19. The machine I am using has around 300 GB of disk space and around 23 GB of RAM. I have allocated around 10 GB of RAM to Elasticsearch. My operations are write-intensive, around 1000 docs/s. Elasticsearch is the only process running on the machine. The documents are small, with no more than 10 fields each. Elasticsearch runs on only one machine, with 1 shard and 0 replicas.

The memory used starts increasing very rapidly when I am sending 1000 docs/s. Although I have allocated only 10 GB of RAM to Elasticsearch, almost 21 GB of RAM gets consumed, and eventually the Elasticsearch process runs out of heap space. Afterwards I need to clear the OS cache to free all the memory. Even when I stop sending Elasticsearch the 1000 docs/s, the memory does not get freed automatically.

For example, when I run Elasticsearch with around 1000 docs/s of write operations, I find that it reaches 18 GB of RAM usage very quickly, and when I later reduce the write rate to only 10 docs/s, the memory used still shows around 18 GB. I would expect it to come down as the number of write operations decreases. I am using the Bulk API for my write operations, with 100 docs per request. The data comes from 4 machines when the write rate is around 1000 docs/s.
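Roughly, my bulk indexing code looks like the sketch below (the host, index name, type and document fields are placeholders here, not my real values):

    import json
    import requests

    ES_BULK = "http://localhost:9200/_bulk"   # placeholder host/port
    INDEX, DOC_TYPE = "my_index", "my_type"   # placeholder names
    BATCH_SIZE = 100                          # 100 docs per bulk request

    def bulk_index(docs):
        """Send one batch of small documents via the Bulk API (newline-delimited JSON)."""
        lines = []
        for doc in docs:
            lines.append(json.dumps({"index": {"_index": INDEX, "_type": DOC_TYPE}}))
            lines.append(json.dumps(doc))
        body = "\n".join(lines) + "\n"         # the bulk body must end with a newline
        resp = requests.post(ES_BULK, data=body)
        resp.raise_for_status()
        return resp.json()

    # One batch of 100 small documents with about 10 fields each.
    bulk_index([{"field%d" % i: i for i in range(10)} for _ in range(BATCH_SIZE)])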

These are the figures I get from top:

    Mem:  24731664k total, 18252700k used,  6478964k free,   322492k buffers
    Swap:  4194296k total,        0k used,  4194296k free,  8749780k cached

      PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     1004 elastics 20   0 10.7g 8.3g  10m S    1 35.3 806:28.69 java

Please let me know if anyone has any idea what could be the reason for this. I have had to stop my application because of this issue. I think I am missing some configuration. I have already read all the cache-related documentation for Elasticsearch here: http://www.elasticsearch.org/guide/reference/index-modules/cache.html

I have also tried clearing the cache using the clear cache API, and also tried the flush API, but did not get any improvement.
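For reference, the clear cache and flush calls I tried look roughly like this (host and index name are placeholders, not my real values):

    import requests

    ES = "http://localhost:9200"   # placeholder host/port
    INDEX = "my_index"             # placeholder index name

    # Clear the caches for the index.
    requests.post("%s/%s/_cache/clear" % (ES, INDEX)).raise_for_status()

    # Flush the index, committing the transaction log to Lucene.
    requests.post("%s/%s/_flush" % (ES, INDEX)).raise_for_status()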

Thanks in advance.

Why was this question closed? I have exactly the same issue – anyone know if this got reposted somewhere? – james lewis
OK, I found it – to anyone else who gets here: elasticsearch-users.115913.n3.nabble.com/… – james lewis
This is a good question. Thanks for the pointer to an answer. – Ian Lewis
This is a good question, why was it closed? I found the following link useful: blog.sematext.com/2012/05/17/elasticsearch-cache-usage – shyos
Excellent, excellent question. Too bad the mods thought they were being smart and closed it, instead of 10000 random other questions about Rails. – Henley

2 Answers

2 votes

To summarize the answer on the mailing list thread: the problem was that the Ruby client wasn't able to throttle its inserts, and Lucene memory usage does grow as large numbers of documents are added. I think there may also be an issue with commit frequency: it's important to commit from time to time in order to flush newly added documents to disk. Is the OP still having the problem? If not, could you post the solution?
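If throttling is indeed the issue, something along these lines might help on the client side (the rate limit, flush frequency, host and index name below are placeholder assumptions, not values from the thread). It sends pre-built bulk bodies at a capped rate and flushes periodically so newly added documents get committed to disk:

    import time
    import requests

    ES = "http://localhost:9200"          # placeholder host/port
    INDEX = "my_index"                    # placeholder index name
    BATCH_SIZE = 100                      # docs per bulk request
    MAX_DOCS_PER_SEC = 500                # placeholder throttle target
    FLUSH_EVERY_N_BATCHES = 50            # placeholder commit frequency

    def send_throttled(bulk_bodies):
        """Send pre-built bulk bodies at a capped rate, flushing periodically."""
        interval = BATCH_SIZE / float(MAX_DOCS_PER_SEC)     # seconds between batches
        for n, body in enumerate(bulk_bodies, start=1):
            started = time.time()
            requests.post(ES + "/_bulk", data=body).raise_for_status()
            if n % FLUSH_EVERY_N_BATCHES == 0:
                # Commit the transaction log to disk from time to time.
                requests.post("%s/%s/_flush" % (ES, INDEX)).raise_for_status()
            elapsed = time.time() - started
            if elapsed < interval:
                time.sleep(interval - elapsed)   # back off to hold roughly MAX_DOCS_PER_SEC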

2 votes

I think that your ingestion is too heavy for the cluster's capacity, so data keeps piling up in memory. You should monitor your disk I/O; it is likely the bottleneck.

You should then:

  • slow down the ingestion (you could use a more robust queue such as Kafka or RabbitMQ, or Logstash's persistent queue; a minimal backpressure sketch follows this list)
  • use fast SSDs to increase I/O capacity
  • add more nodes (and adjust the number of shards of your indices) for better I/O parallelism
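For the first point, a minimal in-process sketch of the backpressure idea (a stand-in for Kafka/RabbitMQ; the queue size and host are placeholder assumptions):

    import threading
    import queue
    import requests

    ES_BULK = "http://localhost:9200/_bulk"   # placeholder host/port
    pending = queue.Queue(maxsize=20)         # bounded: put() blocks when Elasticsearch falls behind

    def indexer():
        """Drain the queue and send bulk bodies one at a time."""
        while True:
            body = pending.get()
            if body is None:                  # sentinel to stop the worker
                break
            requests.post(ES_BULK, data=body).raise_for_status()

    worker = threading.Thread(target=indexer)
    worker.start()

    # Producers call pending.put(bulk_body); the bounded queue throttles them
    # to whatever rate the indexer (and the disk behind it) can actually sustain.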

As small optimizations, you can also improve performance a little by:

  • increasing the refresh_interval. Refreshing consumes RAM, so avoiding it while you are in heavy ingestion mode can help a lot
  • if you are doing an initial ingestion into your index, removing all replicas during the ingestion phase and re-adding them once the ingestion is done (a settings sketch follows this list)
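As a concrete illustration of those two settings (the host, index name and the restored refresh_interval value below are placeholders, not values from your setup):

    import json
    import requests

    ES = "http://localhost:9200"   # placeholder host/port
    INDEX = "my_index"             # placeholder index name

    def update_settings(settings):
        """Apply dynamic index settings via the update-settings API."""
        resp = requests.put("%s/%s/_settings" % (ES, INDEX), data=json.dumps(settings))
        resp.raise_for_status()

    # Before heavy ingestion: disable refresh and drop replicas.
    update_settings({"index": {"refresh_interval": "-1", "number_of_replicas": 0}})

    # ... run the bulk load ...

    # Afterwards: restore a normal refresh interval and re-add replicas.
    update_settings({"index": {"refresh_interval": "1s", "number_of_replicas": 1}})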