We are using SolrCloud 6.3. We have a collection, say "MyCore", which is divided into two shards with two replicas. Each shard is located on a single server.

Specifications: index size 4 GB, heap size 2 GB, total RAM on machine 40 GB, number of CPUs 32.

Our indexing job runs once a minute, and we are using ZooKeeper to add documents to Solr. Here is the commit configuration from solrconfig.xml:

<updateHandler class="solr.DirectUpdateHandler2">
    <maxPendingDeletes>100000</maxPendingDeletes>
    <updateLog>
        <int name="numRecordsToKeep">200</int>
        <int name="maxNumLogsToKeep">5</int>
    </updateLog>
    <autoCommit>
        <maxDocs>500</maxDocs>
        <maxTime>120000</maxTime>
        <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
        <maxDocs>300</maxDocs>
        <maxTime>60000</maxTime>
    </autoSoftCommit>
</updateHandler>

We are seeing "Read timed out" exceptions in the Solr logs. When they occur, Solr becomes unresponsive for about 30 seconds and then recovers on its own, without a restart. We have analysed the GC logs and found nothing unusual; GC activity looks healthy. The relevant log output is below.

2017-11-06 07:05:00.121 ERROR (updateExecutor-2-thread-624-processing-n:192.168.0.1:8983_solr x:MyCore_shard2_replica1 s:shard2 c:MyCore r:core_node3) [c:MyCore s:shard2 r:core_node3 x:MyCore_shard2_replica1] o.a.s.u.SolrCmdDistributor org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://192.168.0.4:8983/solr/MyCore_shard1_replica3
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:604)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.request(ConcurrentUpdateSolrClient.java:420)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
    at org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:293)
    at org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:282)
    at org.apache.solr.update.SolrCmdDistributor$$Lambda$119/389184320.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$8/534303375.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:498)
    ... 15 more

1 Answer

How often are these timeouts happening? And when they do, does the unresponsive period last exactly 30 seconds each time?

This may have something to do with your soft/hard commit intervals being so long. Solr shouldn't be waiting the full interval to commit, but sometimes it does, even when it's not under load.

Depending on how quickly you're indexing data through ZK, I'd try something more like a 30s soft commit and a 60s hard commit. If your indexing load is very high, you may need to raise those to longer intervals.
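
As a rough sketch, the 30s/60s suggestion would look something like this in the commit section of solrconfig.xml. Only the two maxTime values are changed; the maxDocs thresholds and openSearcher setting are carried over from the question, and keeping them is an assumption rather than a recommendation. Note that with maxDocs set, a commit can fire before the time limit is reached.

```xml
<!-- Sketch only: maxTime values follow the 30s soft / 60s hard suggestion. -->
<!-- maxDocs and openSearcher are copied from the question unchanged.      -->
<autoCommit>
    <maxDocs>500</maxDocs>
    <maxTime>60000</maxTime>            <!-- hard commit every 60 seconds -->
    <openSearcher>false</openSearcher>  <!-- hard commit does not open a new searcher -->
</autoCommit>
<autoSoftCommit>
    <maxDocs>300</maxDocs>
    <maxTime>30000</maxTime>            <!-- soft commit every 30 seconds -->
</autoSoftCommit>
```

The values are in milliseconds, and these two elements replace the existing autoCommit and autoSoftCommit blocks inside updateHandler. Treat the numbers as a starting point to test against your own indexing rate, not as tuned settings.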

Here is an interesting document I looked at when I first set up my soft/hard commit times and I've had no issues. The Recommendation section references multiple loads/setups.

Good Luck!