Consider the following table (simplified version):
id int,
amount decimal,
transaction_no,
location_id int,
created_at datetime
The above schema is used to store POS receipts for restaurants. Now, this table sometimes contains a receipt from same date, same transaction_no at same location_id.
In which case what I want to do is get the last receipt of that location_id & transaction_no order by created_at desc.
In MySQL, I use the following query which gets me last (max(created_at)
receipt for a location_id & transaction_no:
SELECT id, amount, transaction_no, location_id, created_at
FROM receipts r JOIN
(SELECT transaction_no, max(created_at) AS maxca
FROM receipts r
GROUP BY transaction_no
) t
ON r.transaction_no = t.transaction_no AND r.created_at = t.maxca
group by location_id;
But when I run the same in BigQuery, I get the following error:
Query Failed Error: Shuffle reached broadcast limit for table __I0 (broadcasted at least 150393576 bytes). Consider using partitioned joins instead of broadcast joins . Job ID: circular-gist-812:job_A_CfsSKJICuRs07j7LHVbkqcpSg
Any idea how to make the above query work in BigQuery?