ELK on Docker Swarm and GlusterFS crash

Question

I'm trying to deploy an ELK stack on Docker Swarm.

If I bind the elastic data directory to a Docker volume there is no problem.

The problem appears as soon as I bind the Elasticsearch data directory to a GlusterFS volume. I use GlusterFS to synchronise the data between all the swarm nodes in the cluster. I deploy ELK using the following code:

elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.3
    # container_name: elasticsearch
    environment: 
      - "http.host=0.0.0.0"
      - "transport.host=127.0.0.1"
      - "ELASTIC_PASSWORD=changeme"
      - "TAKE_FILE_OWNERSHIP=1"
    ports: ['127.0.0.1:9200:9200']
    volumes:
      - /opt/dockershared/stack-elk/elk:/usr/share/elasticsearch/data
    networks: ['stack']

The dir '/opt/dockershared/' is a glusterFS volume:

myhost:/gvol0 on /opt/dockershared type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev)

The ELK stack starts without problems, but after 30 to 60 minutes the shard allocation fails. In the ELK logs I see the following exceptions:

[2018-04-13T08:58:16,749][WARN ][o.e.i.e.Engine ] [MPxFOvC] [metricbeat-6.2.3-2018.04.13][0] failed engine [refresh failed source[schedule]]
org.apache.lucene.index.CorruptIndexException: Problem reading index from store(MMapDirectory@/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@73620ce7) (resource=store(MMapDirectory@/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@73620ce7))
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:140) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    ......
Caused by: java.io.EOFException: read past EOF: MMapIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index/_47.cfe")
    at org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:75) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    ......
    Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: remaining=0, please run checkindex for more details (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index/_47.cfe")))
    .....

What could be the problem? What is the best way to share the Elasticsearch data directory among all the swarm nodes?

Thank you.

Fabry Fabry · Accepted Answer · 2018-04-17T15:15:24

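I asked on the ELK forum, and this is the answer I received: elk forum

Basically, ELK supports only local disk or block storage for the data directory, so a GlusterFS mount like the one above is not a supported location for it.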
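
As a rough illustration of what "local disk" could look like in the stack file from the question, here is a minimal sketch. It is not taken from the forum answer; it assumes a Compose file version that supports `deploy`, a named volume (`esdata`, a name chosen here) backed by the default local driver, and a placement constraint that pins the service to one node so it always finds the same local data. The hostname `es-node-1` is a placeholder, and the `ports` entry from the question is left out to keep the sketch focused on storage.

```yaml
version: "3.4"

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.3
    environment:
      - "http.host=0.0.0.0"
      - "transport.host=127.0.0.1"
      - "ELASTIC_PASSWORD=changeme"
    volumes:
      # Named volume with the default local driver: data stays on the node's own disk
      - esdata:/usr/share/elasticsearch/data
    networks: ['stack']
    deploy:
      replicas: 1
      placement:
        constraints:
          # Placeholder hostname (assumption): pin the task to the node that holds the volume
          - node.hostname == es-node-1

volumes:
  esdata:

networks:
  stack:
```

With this layout the data is no longer synchronised across swarm nodes by the filesystem, which is the trade-off the forum answer implies: the service has to run where its data lives.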
