I have 3 servers, each running Solr 5.3 and ZooKeeper (solr-cloud-01/zookeeper-01, solr-cloud-02/zookeeper-02 and solr-cloud-03/zookeeper-03).

ZooKeeper is up and running; one of the servers is the leader and the others are followers.

# zkServer.sh status 

If I try to create a Solr collection, the config is uploaded to ZooKeeper correctly, but the core itself is never created; the request times out after 180s.

# solr create_collection -c [collection_name] -d [config_name]

Connecting to ZooKeeper at zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181 ...    
Uploading /opt/solr/server/solr/configsets/[config_name]/conf for config 
[collection_name] to ZooKeeper at zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181

(or)

Re-using existing configuration directory [collection_name]

next:

Creating new collection '[collection_name]' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=
[collection_name]&numShards=1&replicationFactor=1&maxShardsPerNode=1&
collection.configName=[collection_name]

ERROR: Failed to create collection '[collection_name]' due to: 
create the collection time out:180s

The Solr admin console log shows 2 identical error messages, one from SolrCore and the other from SolrDispatchFilter:

null:org.apache.solr.common.SolrException: create the collection time out:180s
    at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:239)
    at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:170)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
    at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:675)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:443)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

If I then edit /opt/zookeeper/conf/zoo.cfg and comment out the other ZooKeeper servers (reducing the ensemble to 1 server):

server.1=zookeeper-01:2888:3888
#server.2=zookeeper-02:2888:3888
#server.3=zookeeper-03:2888:3888

And change the ZK_HOST option in /var/solr/solr.in.sh:

#ZK_HOST="zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181"
ZK_HOST="zookeeper-01:2181"

And restart both ZooKeeper and Solr, then the core is created (was it queued somehow?), but it is offline because the quorum was down (1 of 3 ZooKeeper nodes).


So then I experimented with a standalone solr / zookeeper setup (solr-cloud-01 / zookeeper-01)

# zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: standalone

I executed the same command:

# solr create_collection -c [collection_name] -d [config_name]

Connecting to ZooKeeper at zookeeper-01:2181 ...
Uploading /opt/solr/server/solr/configsets/[config_name]/conf for config [collection_name] 
to ZooKeeper at zookeeper-01:2181

Creating new collection '[collection_name]' using command:
http://localhost:8983/solr/admin/collections?action=CREATE
&name=[collection_name]&numShards=1&replicationFactor=1&
maxShardsPerNode=1&collection.configName=[collection_name]

{
  "responseHeader":{
    "status":0,
    "QTime":9417},
  "success":{"":{
      "responseHeader":{
        "status":0,
        "QTime":8869},
      "core":"[collection_name]_shard1_replica1"}}}

So that works!


In conclusion, I have the feeling that some routes are not configured correctly, but I can't figure out which, because ZooKeeper seems to work and so do all the individual Solr instances.

Here is my hosts file:

127.0.0.1 localhost
10.0.0.1 solr-cloud-01  
10.0.0.2 solr-cloud-02
10.0.0.3 solr-cloud-03
10.0.0.1 zookeeper-01
10.0.0.2 zookeeper-02
10.0.0.3 zookeeper-03

1 Answer

So, I finally found the answer!

After inspecting /clusterstate.json via zkCli.sh, I saw that while the nodes were disconnected, 3 'rogue' replicas had been added to the standalone cluster, all pointing to 127.0.1.1 (a Debian-specific loopback address for the machine's own hostname, see https://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution).
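For anyone who wants to repeat that check, here is a rough sketch of how the cluster state can be filtered for replicas registered on a loopback address. The helper name loopback_replicas is my own invention, not part of Solr or ZooKeeper, and it assumes the state JSON contains "base_url" entries such as "http://127.0.1.1:8983/solr". Depending on the collection's state format, the data may live in /collections/[collection_name]/state.json instead of /clusterstate.json.

```shell
#!/bin/sh
# Hypothetical helper: read cluster state JSON on stdin and print every
# replica base_url that points at a 127.x.x.x loopback address.
loopback_replicas() {
    grep -oE '"base_url" *: *"[^"]*"' \
        | grep -E '//127\.' \
        | sed -E 's/.*"(http[^"]*)"$/\1/'
}

# Assumed usage against the first ZooKeeper node from the question:
# zkCli.sh -server zookeeper-01:2181 get /clusterstate.json | loopback_replicas
```

If this prints anything, other nodes are being told to reach that replica on their own loopback interface, which would explain a create request that never completes.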

The clue was in my hosts file.

So when I changed all hostname references from 127.0.1.1 to the external IP (in my case 10.0.0.x), it started working!

My new hosts file:

127.0.0.1 localhost
10.0.0.1 solr-cloud-01
10.0.0.2 solr-cloud-02
10.0.0.3 solr-cloud-03
10.0.0.1 zookeeper-01
10.0.0.2 zookeeper-02
10.0.0.3 zookeeper-03
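
To catch this on the other nodes before it bites, a small check like the one below can list every hostname that a hosts file still maps to a loopback address. This is only a sketch: the function name loopback_names is made up for illustration, and it assumes the usual hosts layout of an IP address followed by one or more names.

```shell
#!/bin/sh
# Hypothetical helper: print every hostname in a hosts file that maps to a
# 127.x.x.x address, skipping the usual localhost aliases.
# Comments and blank lines are ignored.
loopback_names() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        { sub(/#.*$/, "") }                # strip trailing comments
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++)
                if ($i !~ /^localhost/) print $i
        }
    ' "$hosts_file"
}

# Example: check the local machine (prints nothing when the file is clean).
# loopback_names /etc/hosts
```

On a default Debian install I would expect this to print the machine's own hostname (the 127.0.1.1 entry); after the change above it should print nothing on any of the three servers.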