How can I find a number range inside a piece of text using ElasticSearch

Question

I've been searching the internet trying to work out how to find a number range in a piece of text using ElasticSearch, but I have had no luck.

Here's an example. Say I have the following set of documents (note that each document is NOT split into multiple fields; it's just a block of text):

doc1{ msg:"I have 7 books" }

doc2{ msg:"I have 15 books" }

doc3{ msg:"I have 19 books" }

Is it possible to form a query using ElasticSearch to find all the people who own between 10 and 20 books?

Thanks Rich

Sloan Ahrens · Accepted Answer · 2015-04-06T17:58:30

In ES 1.5, the keep_types token filter is designed for this sort of thing, apparently. I set it up (with ES 1.5) in this code, and it seems to work:

http://sense.qbox.io/gist/b2c86b748d0c33957df1dcb90a3b405b0a4ca646
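In case the link goes stale, here is a rough sketch of what index settings using keep_types could look like. This is not necessarily what the gist contains: the filter name "numbers_only" and the analyzer name "number_analyzer" are made up for illustration, and the mapping assumes the ES 1.x "string" field type and the "doc" type used in this answer. The Python below only builds and prints the JSON body you would send with PUT /test_index.

```python
import json

# Hypothetical names: "numbers_only" and "number_analyzer" are illustrative only.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "numbers_only": {
                    # keep_types keeps only tokens of the listed types;
                    # the standard tokenizer tags numeric tokens as <NUM>.
                    "type": "keep_types",
                    "types": ["<NUM>"],
                }
            },
            "analyzer": {
                "number_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["numbers_only"],
                }
            },
        }
    },
    "mappings": {
        # Assumes the ES 1.x mapping layout with a "doc" type and a string field.
        "doc": {
            "properties": {
                "msg": {"type": "string", "analyzer": "number_analyzer"}
            }
        }
    },
}

# Print the body so it can be pasted into a PUT /test_index request.
print(json.dumps(index_body, indent=2))
```

With an analyzer like this, only the numeric tokens of msg would be indexed, although they are still stored as string terms.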

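However, I didn't actually need that to get it to work. The standard analyzer splits the text into tokens (roughly at word boundaries), so each number is indexed as a term of its own, and you can then apply a range query (a filter works too) against the field. It seems to do what you wanted, with one caveat: on a string field the terms are compared as strings rather than as numbers, so a term such as "100" would also fall between "10" and "20".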

I set up a simple index:

DELETE /test_index

PUT /test_index

POST /test_index/doc/_bulk
{ "index": { "_id": 1 }}
{ "msg": "I have 7 books" }
{ "index": { "_id": 2 }}
{ "msg": "I have 15 books" }
{ "index": { "_id": 3 }}
{ "msg": "I have 19 books" }

Then used a range query:

POST /test_index/_search
{
    "query": {
        "range": {
           "msg": {
              "from": 10,
              "to": 20
           }
        }
    }
}
...
{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 1,
            "_source": {
               "msg": "I have 15 books"
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "3",
            "_score": 1,
            "_source": {
               "msg": "I have 19 books"
            }
         }
      ]
   }
}

Here is the code for the second example:

http://sense.qbox.io/gist/0979803673efb5b7ff063c257efd82617a93bd06
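To see why the range query picks up documents 2 and 3, and where it can go wrong, here is a small Python sketch that mimics the matching outside Elasticsearch. It is only an approximation: the regex tokenizer is a rough stand-in for the standard analyzer, and document 4 ("I have 100 books") is a hypothetical addition that is not part of the test data above.

```python
import re

docs = {
    1: "I have 7 books",
    2: "I have 15 books",
    3: "I have 19 books",
    4: "I have 100 books",  # hypothetical extra document, not in the original data
}

def tokens(text):
    # Rough stand-in for the standard analyzer: lowercased word/number tokens.
    return re.findall(r"\w+", text.lower())

def string_range_match(text, low, high):
    # Compare terms as strings, which is how a range on a string field behaves.
    return any(low <= t <= high for t in tokens(text))

def numeric_range_match(text, low, high):
    # Compare only the numeric tokens, as actual numbers.
    return any(low <= int(t) <= high for t in tokens(text) if t.isdecimal())

string_hits = [doc_id for doc_id, msg in docs.items()
               if string_range_match(msg, "10", "20")]
numeric_hits = [doc_id for doc_id, msg in docs.items()
                if numeric_range_match(msg, 10, 20)]

print(string_hits)   # [2, 3, 4]
print(numeric_hits)  # [2, 3]
```

The string comparison returns the expected documents for the three original messages, but it also lets "100" through because "100" sorts between "10" and "20". If the numbers can have different lengths, extracting them into a numeric field at index time is the safer route.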
