2
votes

I am using Spark Streaming to aggregate HTTP requests into HTTP sessions and index the sessions into Elasticsearch in upsert mode, keyed on session id. Each session contains a robotic score that is computed and updated in real time. I want to propagate the robotic score to all the HTTP requests that belong to the same session. Is there a way to perform such an update on already indexed HTTP requests in real time?

1 Answer

1
votes

Elasticsearch doesn't (at the time of writing) support UPDATE ... WHERE style queries.

You will have to do this in 2 steps.

  1. Perform a query to get all documents with a particular session id.
  2. Update each document with the score using a partial update.

See https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html for more details, but to paraphrase, something like:

POST /sessions/1/_update
{ "doc" : { "score": 22 } }

Here the 1 in the URL is the id of the document you want to update. The _update operation keeps any existing fields and changes only the score. (Note that "update" is not strictly accurate: Elasticsearch creates a new document with the merged field values and deletes the old one. For your case that distinction is irrelevant.)
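To make the two steps concrete, here is a minimal Python sketch that only builds the request bodies, so it runs without a cluster. The field name `session_id`, the index name `requests`, and both helper function names are assumptions for illustration, not something from your mapping. The second helper batches the partial updates into a body for the `_bulk` endpoint rather than sending one `_update` call per document; older Elasticsearch versions would also need a `_type` entry in each action line.

```python
import json


def build_session_query(session_id, field="session_id"):
    """Step 1: search body matching every request document of one session.

    `field` is a hypothetical field name; use whatever your mapping stores.
    `_source` is disabled because only the document ids are needed.
    """
    return {"query": {"term": {field: session_id}}, "_source": False}


def build_bulk_score_updates(index, doc_ids, score):
    """Step 2: newline-delimited body for the _bulk endpoint.

    Each document gets an action line followed by a partial `doc`,
    which is the same payload a single _update request would carry.
    """
    lines = []
    for doc_id in doc_ids:
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"doc": {"score": score}}))
    # The bulk format requires every line, including the last, to end with \n.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(json.dumps(build_session_query("abc123")))
    print(build_bulk_score_updates("requests", ["1", "2"], 22), end="")
```

You would send the first body to the search endpoint of the requests index, collect the `_id` of each hit, and then post the second body to `_bulk`. In a streaming job it probably makes sense to do this once per micro-batch for the sessions whose score changed, rather than once per request.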