3
votes

I'm executing a search query on an ElasticSearch index that is giving me strange results. I want to find all documents where the product.id = 209349:

{
  "index": "products",
  "from": 0,
  "size": 100,
  "body": {
    "query": {
      "filtered": {
        "filter": [
          {
            "term": {
              "product.id": 209349
            }
          }
        ]
      }
    }
  }
}

However, the results are returning me a document where product.id = 83875. What I noticed is the product.variant.id equals 209349... What's going on here?

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "products",
        "_type": "product",
        "_id": "UPC-83875",
        "_score": 1,
        "_source": {
          "mpn": "UPC-83875",
          "product_count": 1,
          "price": "448.00",
          "price_discount_amount": null,
          "product": [
            {
              "id": 83875,
              "posted_on": "2014-07-23 22:08:36",
              "status_id": 3,
              "sku": "23469984",
              "mpn": "UPC-83875",
              "name": "Laser Toner Cartridge Set Black Cyan Yellow Magenta",
              "description": "",
              "has_image": true,
              "currency_id": 1,
              "price": "448.00",
              "variant": [
                {
                  "id": 209349,
                  "sku": "23469984",
                  "name": null,
                  "price": "448.00",
                  "discount_amount": null,
                  "price_total": "448.00",
                  "has_image": false
                }
              ]
            }
          ]
        }
      }
    ]
  }
}

Here's the schema:

{
  "dynamic": "strict",
  "properties": {
    "mpn": {
      "type": "string",
      "index": "not_analyzed"
    },
    "price": {
      "type": "double"
    },
    "price_discount_amount": {
      "type": "double"
    },
    "product": {
      "properties": {
        "id": {
          "type": "long"
        },
        "posted_on": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "status_id": {
          "type": "long"
        },
        "sku": {
          "type": "string",
          "index": "not_analyzed"
        },
        "mpn": {
          "type": "string",
          "index": "not_analyzed"
        },
        "name": {
          "type": "string",
          "analyzer": "english"
        },
        "description": {
          "type": "string",
          "analyzer": "english"
        },
        "has_image": {
          "type": "boolean"
        },
        "price": {
          "type": "double"
        },
        "price_discount_amount": {
          "type": "double"
        },
        "currency_id": {
          "type": "long"
        },
        "variant": {
          "properties": {
            "id": {
              "type": "long"
            },
            "sku": {
              "type": "string",
              "index": "not_analyzed"
            },
            "name": {
              "type": "string",
              "analyzer": "english"
            },
            "discount_amount": {
              "type": "double"
            },
            "price": {
              "type": "double"
            },
            "price_total": {
              "type": "double"
            },
            "has_image": {
              "type": "boolean"
            }
          }
        }
      }
    }
  }
}

Edit/Solved: It appears the issue was the fact that my index _type is named product as well as the inner object. So what was happening was ES matched <_type>.id (kinda like doing *.id) and it would match the variant.id because it was an id field.

The correct name to use is apparently product.product.id which ends up being a full path.

This behavior is apparently to appease some users long ago: https://github.com/elastic/elasticsearch/issues/3005

1
can you post the mapping for product field ?keety
Which version of elasticsearch you are using? There used to be an issue regarding field resolution. Check out github.com/elastic/elasticsearch/issues/4081Prabin Meitei
I'm using version 1.4.4 - Someone just pointed this out to me: elastic.co/guide/en/elasticsearch/reference/1.3/… I'm reading about the way field resolution works, but this still would not explain why "product.id" would be confused with "product.variant.id". :/Ian

1 Answers

1
votes

I'm not sure why you're getting that result (it wasn't returned when I tried it), but you might want to try a mapping like this, using nested types:

PUT /test_index
{
   "mappings": {
      "doc": {
         "properties": {
            "mpn": {
               "type": "string"
            },
            "price": {
               "type": "string"
            },
            "product_count": {
               "type": "long"
            },
            "product": {
               "type": "nested",
               "properties": {
                  "currency_id": {
                     "type": "long"
                  },
                  "description": {
                     "type": "string"
                  },
                  "has_image": {
                     "type": "boolean"
                  },
                  "id": {
                     "type": "long"
                  },
                  "mpn": {
                     "type": "string"
                  },
                  "name": {
                     "type": "string"
                  },
                  "posted_on": {
                     "type": "string"
                  },
                  "price": {
                     "type": "string"
                  },
                  "sku": {
                     "type": "string"
                  },
                  "status_id": {
                     "type": "long"
                  },
                  "variant": {
                     "type": "nested",
                     "properties": {
                        "has_image": {
                           "type": "boolean"
                        },
                        "id": {
                           "type": "long"
                        },
                        "price": {
                           "type": "string"
                        },
                        "price_total": {
                           "type": "string"
                        },
                        "sku": {
                           "type": "string"
                        }
                     }
                  }
               }
            }
         }
      }
   }
}

When I created an index using this mapping, and indexed your document:

PUT /test_index/doc/UPC-83875
{
   "mpn": "UPC-83875",
   "product_count": 1,
   "price": "448.00",
   "price_discount_amount": null,
   "product": [
      {
         "id": 83875,
         "posted_on": "2014-07-23 22:08:36",
         "status_id": 3,
         "sku": "23469984",
         "mpn": "UPC-83875",
         "name": "Laser Toner Cartridge Set Black Cyan Yellow Magenta",
         "description": "",
         "has_image": true,
         "currency_id": 1,
         "price": "448.00",
         "variant": [
            {
               "id": 209349,
               "sku": "23469984",
               "name": null,
               "price": "448.00",
               "discount_amount": null,
               "price_total": "448.00",
               "has_image": false
            }
         ]
      }
   ]
}

Then used a nested filter in my query, it wasn't returned:

POST /test_index/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "nested": {
               "path": "product",
               "filter": {
                  "term": {
                     "product.id": 209349
                  }
               }
            }
         }
      }
   }
}
...
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 0,
      "max_score": null,
      "hits": []
   }
}

However, it was returned by both this query:

POST /test_index/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "nested": {
               "path": "product",
               "filter": {
                  "term": {
                     "product.id": 83875
                  }
               }
            }
         }
      }
   }
}

and this one:

POST /test_index/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "nested": {
               "path": "product.variant",
               "filter": {
                  "term": {
                     "product.variant.id": 209349
                  }
               }
            }
         }
      }
   }
}

Here is some code I used for testing:

http://sense.qbox.io/gist/8f5fa6f2ced088a42b92f495c0668024b9ef19c8