2
votes

I've a simple bigquery table with 3 columns (and some example data) below:

|---------------------|------------------|------------|
|      Name           |     Time         |  Value     |
|---------------------|------------------|------------|
|          a          |         1        |   x        |
|---------------------|------------------|------------|
|          a          |         2        |   y        |
|---------------------|------------------|------------|
|          a          |         3        |   z        |
|---------------------|------------------|------------|
|          b          |         1        |   x        |
|---------------------|------------------|------------|
|          b          |         4        |   y        |
|---------------------|------------------|------------|

For each name, I'd like to return the value with the max time.

For the above table the 3rd and 5th row should be returned, e.g.,

|---------------------|------------------|------------|
|      Name           |     Time         |  Value     |
|---------------------|------------------|------------|
|          a          |         3        |   z        |
|---------------------|------------------|------------|
|          b          |         4        |   y        |
|---------------------|------------------|------------|

It is roughly like: (1) first group by Name, (2) find out the max time in each group, (3) identify the row with the max time.

Seems for (1) and (2), we can use group by + max(), but i'm not sure how to achieve the (3) step.

Anyone has ideas of what's the best query I can write to achieve this purpose.

Thanks a lot.

2
You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider! - Mikhail Berlyant

2 Answers

3
votes

Below is for BigQuery Standard SQL

#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY time DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t 
GROUP BY name

if to apply to sample data from your question as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'a' name, 1 time, 'x' value UNION ALL
  SELECT 'a', 2, 'y' UNION ALL
  SELECT 'a', 3, 'z' UNION ALL
  SELECT 'b', 1, 'x' UNION ALL
  SELECT 'b', 4, 'y' 
)
SELECT AS VALUE ARRAY_AGG(t ORDER BY time DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t 
GROUP BY name   

result is

Row name    time    value    
1   a       3       z    
2   b       4       y   
1
votes

ROW_NUMBER is one way to go here:

SELECT Name, Time, Value
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Time DESC) rn
    FROM yourTable
) t
WHERE rn = 1;