I can select all the distinct values in a column in the following ways:
SELECT DISTINCT column_name FROM table_name;
SELECT column_name FROM table_name GROUP BY column_name;
But how do I get the row count from that query? Is a subquery required?
You can use the DISTINCT
keyword within the COUNT
aggregate function:
SELECT COUNT(DISTINCT column_name) AS some_alias FROM table_name
This will count only the distinct values for that column.
To do this in Presto using OVER
:
SELECT DISTINCT my_col,
count(*) OVER (PARTITION BY my_col
ORDER BY my_col) AS num_rows
FROM my_tbl
Using this OVER
based approach is of course optional. In the above SQL, I found specifying DISTINCT
and ORDER BY
to be necessary.
Caution: As per the docs, using GROUP BY
may be more efficient.