1
votes

I am indexing a large amount of daily data, ~160GB per index, into Elasticsearch. I am facing a case where I need to update almost all the docs in the indices with a small amount of data (~16GB), which is of the format:

id1,data1
id1,data2
id2,data1
id2,data2
id2,data3
.
.
.

My update operations start at 16,000 lines per second, but within about 5 minutes they drop to 1,000 lines per second and don't go back up after that. Updating this 16GB of data currently takes longer than indexing the entire 160GB.

My conf file for the update operation currently looks as follows:

output {
    elasticsearch {
        action => "update"
        doc_as_upsert => true
        hosts => ["host1","host2","host3","host4"]
        index => "logstash-2017-08-1"
        document_id => "%{uniqueid}"
        document_type => "daily"
        retry_on_conflict => 2
        flush_size => 1000
    }
}

The optimizations I have made to speed up indexing in my cluster, based on the suggestions at https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html, are the two below (the settings calls used to apply them are sketched right after the list):

  1. Setting "indices.store.throttle.type" : "none"
  2. Index "refresh_interval" : "-1"
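For reference, a minimal sketch of how those two settings are applied, assuming Elasticsearch 2.x (the store throttle setting was removed in 5.x) and the host and index names from the config above:

    # cluster-wide: disable store throttling while bulk loading
    curl -XPUT 'http://host1:9200/_cluster/settings' -d '{
      "transient": { "indices.store.throttle.type": "none" }
    }'

    # per index: disable periodic refreshes during the load (re-enable afterwards)
    curl -XPUT 'http://host1:9200/logstash-2017-08-1/_settings' -d '{
      "index": { "refresh_interval": "-1" }
    }'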

I am running my cluster on 4 d2.8xlarge EC2 instances. I have allocated 30GB of heap to each node. While the update is happening, barely any CPU is used and the load average is very low as well.

Despite all this, the update is extremely slow. Is there something obvious that I am missing that is causing this issue? While looking at the thread pool data I find that the number of threads working on bulk operations is constantly high.
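For context, the thread pool numbers come from the cat API; a quick way to watch the bulk pool while the update runs, using the same hosts as above:

    # active, queued and rejected tasks per thread pool on every node; bulk is the one to watch
    curl 'http://host1:9200/_cat/thread_pool?v'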

Any help on this issue would be much appreciated.

Thanks in advance

What language are you using? It sounds like you've got a memory leak somewhere, perhaps a file left open? – RisingSun

I am facing the same issue. Were you able to solve the slowness? – Zaid Amir

3 Answers

2
votes

There are a couple of rule-outs to try here.

Memory Pressure

With 244GB of RAM, this is not terribly likely, but you can still rule it out. Find the jstat command in the JDK for your platform (there are also visual tools for some of them). You want to check both your Logstash JVM and the Elasticsearch JVMs.

jstat -gcutil -h7 {PID of JVM} 2s

This will give you a readout of the various memory pools, garbage-collection counts, and GC timings for that JVM as it works, updating every 2 seconds and printing headers every 7 lines. Excessive time accumulating in FGCT (full GC time) is a sign that your heap is under-allocated.
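As a rough guide (the column names are from Java 8's -gcutil output, and the pgrep pattern is just one way to find the Elasticsearch PID):

    # S0/S1/E/O/M/CCS = survivor, eden, old gen, metaspace, compressed-class-space utilisation (%)
    # YGC/YGCT = young GC count/time; FGC/FGCT = full GC count/time; GCT = total GC time
    jstat -gcutil -h7 $(pgrep -f org.elasticsearch.bootstrap.Elasticsearch) 2s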

I/O Pressure

The d2.8xlarge is a dense-storage instance, and may not be great for a highly random, small-block workload. If you're on a Unix platform, top will tell you how much time you're spending in IOWAIT state. If it's high, your storage isn't up to the workload you're sending it.
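If iostat from the sysstat package is installed, it is a bit more direct than top; a sketch of what to run while an update is in flight:

    # extended per-device stats every 2 seconds; sustained high %iowait and ~100% %util
    # on the data disks mean the storage can't keep up
    iostat -x 2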

If that's the case, you may want to consider provisioned-IOPS EBS volumes rather than the instance-local storage. Or, if your data will fit, consider an instance in the i3 family of high-I/O instances instead.

Logstash version

You don't say which version of Logstash you're using. This being Stack Overflow, you're likely on 5.2; if that's the case, this isn't a rule-out.

But if you're using something in the 2.x series, you may want to set the -w flag to 1 at first and work your way up. Yes, that single-threads the pipeline, but the Elasticsearch output has some concurrency issues in the 2.x series that are largely fixed in the 5.x series.
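As a sketch, assuming the update pipeline from the question is saved as update.conf (a placeholder name), that looks like:

    # run the pipeline with a single worker, then raise -w once throughput is stable
    bin/logstash -w 1 -f update.conf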

0
votes

With Elasticsearch 6.0 we had exactly the same issue of slow updates on AWS, and the culprit was slow I/O. The same data upserted completely fine on a local test stack, but once in the cloud on an EC2 stack, everything died after an initial burst of speedy inserts lasting only a few minutes.

The local test stack was a low-spec server in terms of memory and CPU, but it had SSDs.

The EC2 stack was on EBS volumes of the default gp2 type, with 300 IOPS.

Converting the volumes to type io1 with 3000 IOPS solved the issue and everything got back on track.
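For reference, a change like that can be made in place with the AWS CLI; the volume ID below is a placeholder and the volume stays attached while it is modified:

    # convert an attached gp2 volume to io1 with 3000 provisioned IOPS
    aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --volume-type io1 --iops 3000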

0
votes

I am using the Amazon AWS Elasticsearch service, version 6.0. I need to do heavy writes/inserts from a series of JSON files into Elasticsearch, about 10 billion items. The elasticsearch-py bulk write speed was really slow most of the time, with only occasional bursts of high-speed writes. I tried all kinds of methods, such as splitting the JSON files into smaller pieces, reading the JSON files with multiple processes, and inserting into Elasticsearch with parallel_bulk; nothing worked. Finally, after I upgraded to an io1 EBS volume with 10,000 write IOPS, everything went smoothly.