
I am working with Elasticsearch 5 and Nest 5.

I am running an asynchronous update by query that will update a large number of documents. I am using "WaitForCompletion(false)" to do this.

The problem I am facing is that when I use NEST to get the task created by the UpdateByQuery operation, the object that NEST returns does not contain the "Failures" collection.

So if I notice in the task stats that I have version conflicts, for example, how can I get the IDs of the documents with those version conflicts without access to the failures collection?

My UpdateByQuery looks like this:

query
    .Index(allIndexesStr)
    .Query(q =>
        ...
    )
    .WaitForCompletion(false)
    .Script(script => script
        .Inline(scriptStr)
        .Params(p =>
            ...
        )
    )
    ...

I am getting the task like this:

new ElasticClient(ESConnectionSettings.Instance.Settings).GetTask(taskId);

UPDATE

If I inspect the Elasticsearch response body (plain text) in NEST, I get the same information as when doing a GET in Kibana (GET _tasks/{taskId}). So I do not know why NEST is not parsing/mapping this data.

{
  "completed": true,
  "task": {
    "node": "UeShceb_RaqztSOeFnLkvA",
    "id": 44969,
    "type": "transport",
    "action": "indices:data/write/update/byquery",
    "status": {
      "total": 100,
      "updated": 0,
      "created": 0,
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 100,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": "",
    "start_time_in_millis": 1526329389030,
    "running_time_in_nanos": 27966899,
    "cancellable": true
  },
  "response": {
    "took": 27,
    "timed_out": false,
    "total": 100,
    "updated": 0,
    "created": 0,
    "deleted": 0,
    "batches": 1,
    "version_conflicts": 100,
    "noops": 0,
    "retries": {
      "bulk": 0,
      "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1,
    "throttled_until_millis": 0,
    "failures": [
      {
        "index": "testvisitors-1523737378",
        "type": "esvisitor",
        "id": "3232c447-0f1f-4c00-abf5-d651c26b0c8c",
        "cause": {
          "type": "version_conflict_engine_exception",
          "reason": "[esvisitor][3232c447-0f1f-4c00-abf5-d651c26b0c8c]: version conflict, current version [2] is different than the one provided [1]",
          "index_uuid": "jXhfvWFVRq-NCvZXAyr58A",
          "shard": "1",
          "index": "testvisitors-1523737378"
        },
        "status": 409
      }, ...
Why not set wait for complete to true? - sramalingam24
The Get Task API does not return a failures collection: elastic.co/guide/en/elasticsearch/reference/current/tasks.html - Russ Cam
It does return it, same as with Kibana. I updated the question with this info. - Adriano
Are you sure it's not Kibana that's adding that extra information? As @RussCam said, it's not part of the published API - kͩeͣmͮpͥ ͩ
Yes, I checked the NEST body (plain text) in C# and it contains this same info. This is a task created by an UpdateByQuery; I do not know if all tasks have this info. It also makes sense, since once the task finishes you need to get the docs that failed ... - Adriano

1 Answer


To get the Reindex response, read it from your task result as shown below:

var completedReindexResponse = taskResponse.GetResponse<ReindexOnServerResponse>();

For more information, you can check the official ES documentation here.
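
If the typed response does not expose the failures in your NEST version, one fallback is to parse the raw task body from the question directly. The following is only a minimal sketch, assuming Newtonsoft.Json is referenced and that you already have the raw JSON as a string (for example from the response's ApiCall.ResponseBodyInBytes with DisableDirectStreaming() enabled on the connection settings, which is an assumption about your setup). TaskFailureParser and GetVersionConflictIds are hypothetical names, not part of NEST.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of NEST: reads the raw GET _tasks/{taskId} body.
public static class TaskFailureParser
{
    // Returns the IDs of documents listed under response.failures whose
    // cause type is version_conflict_engine_exception.
    public static List<string> GetVersionConflictIds(string taskJson)
    {
        var root = JObject.Parse(taskJson);

        // SelectToken returns null when the path is missing,
        // e.g. while the task is still running and has no "response" yet.
        var failures = root.SelectToken("response.failures") as JArray;
        if (failures == null)
            return new List<string>();

        return failures
            .Where(f => (string)f.SelectToken("cause.type") == "version_conflict_engine_exception")
            .Select(f => (string)f["id"])
            .ToList();
    }
}
```

With the body shown in the question, the returned list would include "3232c447-0f1f-4c00-abf5-d651c26b0c8c". Filtering on cause.type keeps other kinds of failures out of the result; drop the Where clause if you want every failed document ID.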