4
votes

Breaking my head on modeling productvariants using ES (or Solr for that matter)

Consider (contrived example):

  • different products (say T-shirts)
  • each product has a set of properties (productid, name, desc, brand, color, popularity)
  • each product has a set of productvariants with properties (productvariantid (a combination of productid + size), productid, size, availability, price)
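
To make the data model above concrete, here is a minimal sketch of one product and its variants as plain Python dicts. The field names follow the list above; the sample values, the `variant_id` helper, and the `-` separator it uses are assumptions made up for illustration.

```python
def variant_id(product_id, size):
    """Build a productvariantid from productid and size (separator is assumed)."""
    return f"{product_id}-{size}"


# One parent document: the product itself.
product = {
    "productid": "tshirt-1",
    "name": "Plain T-shirt",
    "desc": "A contrived example product",
    "brand": "ExampleBrand",
    "color": "blue",
    "popularity": 7,
}

# One child document per size; each carries the parent's productid.
variants = [
    {
        "productvariantid": variant_id(product["productid"], size),
        "productid": product["productid"],
        "size": size,
        "availability": available,
        "price": price,
    }
    for size, available, price in [("S", True, 19.95), ("M", True, 19.95), ("L", False, 21.95)]
]
```

Because each variant differs in size, constraining a query on a single size yields at most one variant per product, which is what requirement B below relies on.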

This seems to be a standard parent/child relationship between product and productvariant. So I'd like to model it like that in ES.

I'd like to be able to do the following:

  • A. Query for productvariants (and return all properties). No need to return product-properties, productvariant properties are enough.

  • B. Each user-query is constrained so that at most 1 productvariant matches per product (in the above example that means we constrain on productvariant.size)

  • C. filter on price.

  • D. filter on some properties of product

  • E. order on price

  • F. order on a property of product such as popularity, or a combination of the two.

  • G. facet on productvariant.price

  • H. facet on multiple properties of product (the parent)

Doing this with parent/child documents and has_parent in ES: A-E + G are possible.
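
For reference, a rough sketch of what such a has_parent request body could look like, built as a plain Python dict so it does not depend on any client library. The exact DSL differs between ES versions; this assumes the `bool`/`filter` form and a `has_parent` clause taking `parent_type`, and the type name `product` and the helper `build_variant_query` are hypothetical. It covers A-E only; the price facet (G) is left out.

```python
def build_variant_query(size, max_price, brand):
    """Request body that searches productvariants (children) directly."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # B: constrain on size so at most one variant per product matches
                    {"term": {"size": size}},
                    # C: filter on the variant's own price
                    {"range": {"price": {"lte": max_price}}},
                    # D: filter on a property of the parent product
                    {
                        "has_parent": {
                            "parent_type": "product",  # assumed type name
                            "query": {"term": {"brand": brand}},
                        }
                    },
                ]
            }
        },
        # E: order on price, which lives on the variant itself
        "sort": [{"price": {"order": "asc"}}],
    }


body = build_variant_query(size="M", max_price=25, brand="ExampleBrand")
```

Note that nothing in this body can reference `popularity` or any other product field for sorting or faceting, since only variant documents are returned.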

However, how about F and H? I've looked into things such as _scope for facets (although admittedly I don't grok the possibilities 100%) and everything else that comes to mind, but I don't see an obvious way to show facets on product properties, or to sort by them, in conjunction with has_parent.

I've tried other things (on paper):

  • has_child: no luck, since I need the variant info returned.
  • embedded docs (variants inside the product), returning the entire product with all its variants. It just feels clunky. Moreover, I'm pretty sure I can't facet/order on price that way.

Help much appreciated

1
note to self: probably possible when this is implemented: github.com/elasticsearch/elasticsearch/issues/1383 - Geert-Jan

1 Answer

5
votes

I banged my head on the wall for a long time trying to get a similar scheme working. My scheme was a Product/Vendor relationship (single product sold by multiple vendors, potentially different descriptions/prices/availability).

Parent->Child mapping in ES just isn't very robust or easy to use right now. Even if you get something working, you'll quickly run into edge cases that are simply impossible because ES doesn't support them.

I think your best bet is to manage the parent->child mapping on your own and store the documents in their own index. Products have an ID which is then stored in the ProductVariant documents as Product_ID. This is actually how ES stores parent->child relationships internally anyway.

In practice, you query your "top level" index (Products), then perform a second query on your ProductVariant index with a filter on the Product_ID field.
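
A minimal sketch of that two-step approach, again using plain Python dicts instead of a real client. The helper names `variant_query` and `order_by_parent`, the `terms` filter form, and the sample hits are assumptions for illustration; only the Product_ID field comes from the description above.

```python
def variant_query(product_ids, size, max_price):
    """Second-step request body: variants belonging to the matched products."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"Product_ID": list(product_ids)}},
                    {"term": {"size": size}},
                    {"range": {"price": {"lte": max_price}}},
                ]
            }
        }
    }


def order_by_parent(variant_hits, product_hits, key="popularity"):
    """Sort variants by a field of their parent product, highest value first."""
    parents = {p["productid"]: p for p in product_hits}
    return sorted(variant_hits, key=lambda v: parents[v["Product_ID"]][key], reverse=True)


# Hypothetical hits from step 1 (Products index) and step 2 (ProductVariant index).
product_hits = [
    {"productid": "p1", "brand": "ExampleBrand", "popularity": 5},
    {"productid": "p2", "brand": "ExampleBrand", "popularity": 9},
]
second_body = variant_query([p["productid"] for p in product_hits], size="M", max_price=25)
variant_hits = [
    {"Product_ID": "p1", "size": "M", "price": 19.95},
    {"Product_ID": "p2", "size": "M", "price": 24.50},
]
ordered = order_by_parent(variant_hits, product_hits)
```

With this split, facets on product properties can be requested in the first query, and ordering by a product field happens in application code after the second query returns.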

It's a little more hassle to maintain but sooo much more flexible. At least until ES gets better Parent->Child capabilities.