My order import from Shopify creates a new entry in BigQuery for every order if something has changed since the last import, this way you could see how the order properties change over time and not just the last import state. This also creates multiple entries in the table for the same order where the only unique part is the _sdc_batched_at
and sdc_sequence
values. I'm seeing sometimes as many as 30 entries of the same order.
Table schema...
order:
order_number: Int
fulfillments: Array
_sdc_batched_at: DateTime
_sdc_sequence: Int
What I've done...
I've created a partitioned table that essentially boils down to a subset of entries between a given date range and where fulfillments > 0
Initial Query to reduce dataset...
with orders as (
select order_number, fulfillments, _sdc_batched_at, _sdc_sequence
from `project.shopify.orders`
where created_at between '2018-11-08' and '2018-11-15'
and ARRAY_LENGTH(fulfillments) > 0
)
The problem...
I'm running into issues trying to use distinct or group by since fulfillments is an array and that throws things off. How do I write a query that will only return the latest order entry by _sdc_batched_at
value?
Sample Data
[
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 02:46:21.270 UTC",
"_sdc_sequence": "1541817507934"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:16:16.606 UTC",
"_sdc_sequence": "1541819139795"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:46:12.704 UTC",
"_sdc_sequence": "1541821046476"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:46:12.704 UTC",
"_sdc_sequence": "1541821046476"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:1:07.952 UTC",
"_sdc_sequence": "1541822755508"
}
]
Expected Result
Return only the latest entry by _sdc_batched_at
value
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:1:07.952 UTC",
"_sdc_sequence": "1541822755508"
}