
I am using Ruby on Rails to develop an application with a gem called searchkick, which internally uses Elasticsearch. Everything works fine, but in production we hit a weird issue: after some time the site goes down. The cause, we discovered, was that disk usage on the server was far too high; deleting some Elasticsearch log files from the previous week brought usage down from 92% to 47%. We use rolling logging, and logs are rotated each day. The problem we now face is that even with only one log file from the previous day, usage climbs back up. The log files are taking up a lot of space; the current one alone is 4 GB! How can I prevent that?

The messages are almost all at WARN level:

[00:14:11,744][WARN ][cluster.action.shard ] [Abdul Alhazred] [?][0] sending failed shard for [?][0], node[V52W2IH5R3SwhZ0mTFjodg], [P], s[INITIALIZING], indexUUID [4fhSWoV8RbGLj5jo8PVoxQ], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[?][0] failed recovery]; nested: EngineCreationFailureException[[?][0] failed to create engine]; nested: LockReleaseFailedException[Cannot forcefully unlock a NativeFSLock which is held by another indexer component: /usr/lib64/elasticsearch-1.1.0/data/elasticsearch/nodes/0/indices/?/0/index/write.lock]; ]]
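From what I've read, log rotation in Elasticsearch 1.x is configured in config/logging.yml. Below is a sketch of the size-capped setup I'm considering, based on the 1.x log4j-style keys; I haven't verified it against my exact version:

    # config/logging.yml (Elasticsearch 1.x, log4j-style keys)
    es.logger.level: INFO
    rootLogger: ${es.logger.level}, console, file
    appender:
      file:
        type: rollingFile          # size-based rotation instead of the default dailyRollingFile
        file: ${path.logs}/${cluster.name}.log
        maxFileSize: 100MB         # cap the size of each log file
        maxBackupIndex: 5          # keep at most 5 rotated files, oldest dropped
        layout:
          type: pattern
          conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"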

Looking at some SO questions, I'm trying to increase the ulimit, and considering creating a new node, in the hope that this solves the problem and the size comes down. My limits.conf has 65535 for both hard and soft nofile, and in sysctl.conf fs.file-max is set to more than 100000. Is there any other step I could take to reduce the log file size? I'm also not able to get much insight into which Elasticsearch config changes would help.
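To confirm those limits actually reach the Elasticsearch process (limits.conf only takes effect for new sessions), these checks should work; the _nodes API is part of Elasticsearch 1.x:

    # shell limits, run as the user that launches Elasticsearch
    ulimit -Hn              # hard nofile limit
    ulimit -Sn              # soft nofile limit
    sysctl fs.file-max      # kernel-wide file handle cap

    # what the running node actually sees (max_file_descriptors, mlockall)
    curl 'http://localhost:9200/_nodes/process?pretty'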

Any help would be appreciated. Thanks.

I suggest an upgrade to at least 1.2.4, because of some file locking issues reported in Lucene: issues.apache.org/jira/browse/LUCENE-5612, issues.apache.org/jira/browse/LUCENE-5544 – Andrei Stefan
@AndreiStefan: Upgrading seems to solve the problem, though the upgrade itself involved a whole lot of brain squeezing. Put this as an answer so that I can mark it. – inquisitive

2 Answers


I suggest an upgrade to at least 1.2.4, because of some file locking issues reported in Lucene: http://issues.apache.org/jira/browse/LUCENE-5612, http://issues.apache.org/jira/browse/LUCENE-5544.
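To double-check which version each node is actually running, before and after the upgrade, the root endpoint reports it:

    curl 'http://localhost:9200/?pretty'
    # look for "number" under "version" in the JSON response, e.g. "1.2.4"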


Yes, Elasticsearch and Lucene are both resource-intensive. I did the following to rectify my system:

  1. Stop Elasticsearch. If you start it from the command line (bin/elasticsearch), specify the heap size when starting. For example, I use a 16 GB box, so my command is:

a. bin/elasticsearch -Xmx8g -Xms8g

b. Go to config (elasticsearch/config/elasticsearch.yml) and ensure that

bootstrap.mlockall: true

c. Increase ulimit -Hn and ulimit -Sn to more than 200000

  2. If you start it as a service, then do the following (see the defaults-file sketch after this list):

a. export ES_HEAP_SIZE=10g

b. Go to config (/etc/elasticsearch/elasticsearch.yml) and ensure that

bootstrap.mlockall: true

c. Increase ulimit -Hn and ulimit -Sn to more than 200000

Whether you start it as a service or from the command line, make sure the heap size you set is not more than 50% of the machine's total RAM.
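For the service case, these settings typically live in the package's defaults file. A sketch assuming the 1.x deb/rpm packaging (the path and variable names can differ by distro):

    # /etc/default/elasticsearch (Debian/Ubuntu) or /etc/sysconfig/elasticsearch (RHEL/CentOS)
    ES_HEAP_SIZE=8g              # about 50% of RAM on a 16 GB box
    MAX_OPEN_FILES=200000        # nofile limit applied to the service
    MAX_LOCKED_MEMORY=unlimited  # needed for bootstrap.mlockall: true to take effect

    # then restart the service
    sudo service elasticsearch restart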