2
votes

I want to select all rows for which the timestamp column has the maximum value. The data looks like this:

A      B      timestamp
john   smith   2018
bob    dylan   2018
adam   levine  2017
bob    dylan   2017

The result should be:

A      B      timestamp
john   smith   2018
bob    dylan   2018

With Impala, the following SQL Query works: SELECT * FROM table WHERE timestamp=(SELECT Max(timestamp) from table)

But with Hive, the SQL Query doesn't.

2

2 Answers

4
votes

Please always include the error message.

Try with

SELECT * FROM table WHERE timestamp IN (SELECT Max(timestamp) from table)
0
votes

You can use rank :

SELECT
    D.A
    ,D.B
    ,D.timestamp
FROM 
    (
        SELECT
            A
            ,B
            ,timestamp
            ,RANK(timestamp) over ( order by timestamp DESC ) as rank_
        FROM 
            TABLE
    ) D 
WHERE
    D.rank_=1