I have a log file stored in Elasticsearch, with each line of the file indexed as a separate document. Blocks of messages start and end with certain keywords, and I want to retrieve all documents that fall between the documents containing these keywords. Is there a way to use the range query/range filter in Elasticsearch on text fields to do this?

Sample log file:
...
...
xyz foo "keyword1" .....
..
....
...
xyz bar "keyword2" .....
..
..

I'd like to query all documents between "keyword1" and "keyword2", including the documents containing the keywords themselves. Assume that there are multiple such blocks with "keyword1" and "keyword2".

Additionally, I'm updating the documents containing these keywords with a new field, test_field, whose value is the keyword found in that line. Can this new field be used in range filters to achieve the above task?

Elasticsearch fields: _source: { "log_line", "test_field" }

1 Answer

I assume you also have some identifier that defines the order of those documents. Let's say you have a field line_number.

You could first run two searches, one for each keyword, to find all the documents containing them. That gives you the line numbers of every start and every end document, which you can pair up into blocks. For each pair, you can then search for all documents between the two line numbers (using a range query on line_number, with both bounds inclusive so that the keyword lines themselves are returned). This wouldn't be a pure ES solution; it requires some scripting, e.g. in Python or any other language. Let me know if you need help with the queries.
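A minimal sketch of that approach in Python, under a few assumptions: the ordering field is called line_number (as assumed above), the text lives in log_line, and blocks do not overlap. The helper names keyword_query, pair_blocks and block_query are made up for illustration. The functions only build standard query DSL bodies (match_phrase and range) and pair up line numbers; sending them to Elasticsearch with your client of choice is left out.

```python
from bisect import bisect_left


def keyword_query(keyword, text_field="log_line", order_field="line_number"):
    """Search body that finds every line containing `keyword`, in file order."""
    return {
        "query": {"match_phrase": {text_field: keyword}},
        "sort": [{order_field: "asc"}],
    }


def pair_blocks(starts, ends):
    """Pair each start line number with the nearest end line number at or after it.

    Assumes blocks do not overlap: a start that falls inside an already
    paired block is skipped, and a start without a later end is dropped.
    """
    ends = sorted(ends)
    blocks = []
    last_end = -1
    for start in sorted(starts):
        if start <= last_end:
            continue  # still inside the previous block
        i = bisect_left(ends, start)
        if i == len(ends):
            break  # no closing keyword after this start
        blocks.append((start, ends[i]))
        last_end = ends[i]
    return blocks


def block_query(start, end, order_field="line_number"):
    """Search body for all lines of one block, keyword lines included."""
    return {
        "query": {"range": {order_field: {"gte": start, "lte": end}}},
        "sort": [{order_field: "asc"}],
    }


# Line numbers as they might come back from the two keyword searches.
start_lines = [3, 20]   # documents matching "keyword1"
end_lines = [7, 31]     # documents matching "keyword2"
queries = [block_query(s, e) for s, e in pair_blocks(start_lines, end_lines)]
```

Note that a search returns only 10 hits by default, so for the keyword searches and for long blocks you would need to set size or page through the results.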

But before doing something like this, if I were you, I would critically question this requirement. Why read the log file line by line into ES? Why not use Logstash/Filebeat to load the data with your preferred pattern, so that you have one document containing the whole block? That makes queries and analysis so much easier :)
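To illustrate the "one document per block" idea without committing to a particular Logstash or Filebeat configuration, here is a rough Python sketch of the grouping step you would perform before indexing. The function name group_blocks and the field name log_block are hypothetical, and the sketch assumes blocks are not nested; lines outside any block are simply dropped.

```python
def group_blocks(lines, start_kw="keyword1", end_kw="keyword2"):
    """Collect the lines from each start keyword to the next end keyword
    (both inclusive) into one document per block."""
    docs = []
    current = None  # lines of the block being built, or None when outside a block
    for line in lines:
        if current is None and start_kw in line:
            current = []
        if current is not None:
            current.append(line)
            if end_kw in line:
                docs.append({"log_block": "\n".join(current)})
                current = None
    return docs


sample = [
    "noise before",
    'xyz foo "keyword1" first',
    "inside the block",
    'xyz bar "keyword2" last',
    "noise after",
]
block_docs = group_blocks(sample)
```

Each resulting dictionary could then be indexed as a single document, so that one query on the keywords returns the whole block.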