I am using Spark Streaming to aggregate HTTP requests into HTTP sessions and index the sessions into Elasticsearch in upsert mode, keyed on session id. Each session contains a robotic score that is computed and updated in real time. I want to propagate the robotic score to all the HTTP requests that belong to the same session. Is there a way to perform such an update on already indexed HTTP requests in real time?
1 Answer
Elasticsearch doesn't (currently) support UPDATE WHERE-style queries.
You will have to do this in two steps:
- Perform a query to get all documents with a particular session id
- Update each document with the score using a partial update
See https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html for more details, but to paraphrase, something like:
POST /sessions/1/_update
{
  "doc": {
    "score": 22
  }
}
Where the 1 in the URL is the id of the document you want to update. The _update operation keeps any existing fields and only changes the score. (Note that "update" is not strictly accurate: Elasticsearch indexes a new document with the merged field values and deletes the old one, but for your case that distinction is irrelevant.)
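To make the two steps concrete, here is a rough Python sketch that only builds the request bodies, without talking to a cluster. The index name `requests`, the field name `session_id`, and both helper functions are hypothetical and not from your setup. It assumes `session_id` is indexed as an exact-value (not analyzed) field so a `term` query matches it, and that the ids from step 1 are sent through the `_bulk` endpoint rather than one `_update` call per document. Older Elasticsearch versions also expect a `_type` entry in each action line.

```python
import json


def build_session_query(session_id):
    """Step 1: search body matching every request document of one session.

    Assumes a hypothetical exact-value field named "session_id".
    """
    return {"query": {"term": {"session_id": session_id}}}


def build_bulk_updates(index, doc_ids, score):
    """Step 2: newline-delimited body for the _bulk API.

    Emits one partial update (action line + "doc" line) per document id.
    """
    lines = []
    for doc_id in doc_ids:
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Partial document: only the score changes, other fields are kept.
        lines.append(json.dumps({"doc": {"score": score}}))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(json.dumps(build_session_query("abc123")))
    # The ids would normally come from the hits of the step 1 search.
    print(build_bulk_updates("requests", ["1", "2"], 22), end="")
```

You would POST the first body to the index's `_search` endpoint, collect the `_id` of each hit, and POST the second body to `_bulk`. Batching the updates this way keeps the number of round trips down when a session has many requests.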