3
votes

The following BigQuery SQL will retrieve all other products purchased for a customer (fullVisitorId) who bought a product named "Elephant" and with a category of "Animals".

Now here is the problem. If a the same customer (fullVisitorId) makes two orders within the same date range. Let's say he or she buys a product named "Tiger" in one transaction plus other products then Tiger and those products will show up as a related product too if he or she made a similar order a day before or later the same day buying Elephant because my query is using the fullVisitorId. In other words one fullVisitorId can have 1 or more transactionIDs. One transaction have the same transactionId for every product purchased for the current transaction.

So does anyone know how I can make additional filtering at the end to only show related products with the same transactionId when the primary product Elephant was purchased and ignore all others that with a different transactionID?

SELECT hits.product.productSKU AS other_purchased_products_sku,
       hits.product.v2ProductName AS other_product_name,
       hits.product.v2ProductCategory AS other_prod_category,
       COUNT(hits.product.productSKU) AS quantity,
       hits.transaction.transactionId AS trans_id,
       fullVisitorId
FROM
  ( SELECT fullVisitorId,
           hits.product.productSKU,
           hits.eCommerceAction.action_type,
           hits.product.v2ProductName,
           hits.product.v2ProductCategory,
           hits.transaction.transactionId
   FROM TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2016-04-1'), TIMESTAMP('2016-04-30')))
WHERE fullVisitorId IN
    ( SELECT fullVisitorId
     FROM TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2016-04-1'), TIMESTAMP('2016-04-30'))
     WHERE hits.product.v2ProductCategory CONTAINS 'Animals'
       AND hits.product.v2ProductName = 'Elephant'
       AND hits.eCommerceAction.action_type = '6'
     GROUP BY fullVisitorId)
  AND hits.product.v2ProductCategory CONTAINS 'Animals'
  AND hits.product.v2ProductName IS NOT NULL
  AND hits.product.v2ProductName !='Elephant'
  AND hits.eCommerceAction.action_type = '6'
GROUP BY other_purchased_products_sku,
         other_product_name,
         other_prod_category,
         trans_id,
         fullVisitorId
ORDER BY trans_id DESC;
1

1 Answers

1
votes

Hopefully this is what you need:

SELECT
    a.fv fv,
    a.sku sku,
    a.name name,
    a.category category,
    a.qtd qtd,
    a.tid tid
FROM(
    SELECT
        fullvisitorid fv,
        visitid v,
        hits.product.productsku sku,
        hits.product.v2productname name,
        hits.product.v2productcategory category,
        COUNT(hits.product.productsku) qtd,
        hits.transaction.transactionid tid
    FROM (table_date_range([XXXXXXX.ga_sessions_], timestamp("20160501"), timestamp("20160501"))),
(table_date_range([XXXXXXXX.ga_sessions_intraday_], timestamp("20160501"), timestamp("20160501")))
    WHERE 1 = 1
and hits.ecommerceaction.action_type = '6'
GROUP EACH BY fv, v, sku, name, category, tid
) a
INNER JOIN EACH(
    SELECT
        fullvisitorid fv,
        visitid v,
        hits.product.productsku sku,
        hits.product.v2productname name,
        hits.product.v2productcategory category,
        hits.transaction.transactionid tid
    FROM (table_date_range([XXXXXXXX.ga_sessions_], timestamp("20160501"), timestamp("20160501"))),
(table_date_range([XXXXXXXX.ga_sessions_intraday_], timestamp("20160501"), timestamp("20160501")))
    WHERE 1 = 1
and hits.ecommerceaction.action_type = '6'
and lower(hits.product.v2productcategory) contains "animals"
and lower(hits.product.v2productname) contains "elephant"
    GROUP EACH BY fv, v, sku, name, category, tid
) b
on a.fv = b.fv and a.tid = b.tid

Breaking it down:

First the query group b:

SELECT
    fullvisitorid fv,
    visitid v,
    hits.product.productsku sku,
    hits.product.v2productname name,
    hits.product.v2productcategory category,
    hits.transaction.transactionid tid
FROM (table_date_range([XXXXXXXX.ga_sessions_], timestamp("20160501"), timestamp("20160501"))),
    (table_date_range([XXXXXXXX.ga_sessions_intraday_], timestamp("20160501"), timestamp("20160501")))
WHERE 1 = 1
    and hits.ecommerceaction.action_type = '6'
    and lower(hits.product.v2productcategory) contains "animals"
    and lower(hits.product.v2productname) contains "elephant"
GROUP EACH BY fv, v, sku, name, category, tid

This query first will retrieve the fullvisitorids and transactionids that you want with categories animals and names elephant.

Then in query group a:

SELECT
    fullvisitorid fv,
    visitid v,
    hits.product.productsku sku,
    hits.product.v2productname name,
    hits.product.v2productcategory category,
    COUNT(hits.product.productsku) qtd,
    hits.transaction.transactionid tid
FROM (table_date_range([XXXXXXX.ga_sessions_], timestamp("20160501"), timestamp("20160501"))),
(table_date_range([XXXXXXXX.ga_sessions_intraday_], timestamp("20160501"), timestamp("20160501")))
WHERE 1 = 1
    and hits.ecommerceaction.action_type = '6'
GROUP EACH BY fv, v, sku, name, category, tid

you retrieve all fullvisitorids, their transactions and the correspondent products.

With the inner join only the transactions of your interest remain (fullvisitors that in a given transaction bought in category "animals" and name "elephant").

I hope this helps. Any issues let me know,