0
votes

I have a table that contains two columns, name and value. I would like to count the number of times each of the combination repeats in Hive

          Alex  Biology
          Joe  Chemistry
          Alex  Physics
          Alex  Physics
          Joe   Physics
          Joe   Physics

I expect an output like

          Alex  Biology  1
          Joe  Chemistry 1
          Alex  Physics  2
          Joe   Physics  2

I tried a SQL like count query select distinct name, value, count(*) from tbl But this fails with an error 'Not yet supported place for UDAF count' I tried using collect_set but was unsuccessful with it as well

        select x.name, x.value
           , num_arr
       from (
       select *
          , count(collect_set(name) over (partition by value)) num_arr
           from count_unique ) x
1

1 Answers

0
votes

You can used simple group_by function and count the combination. Try this

select x.name, x.value, count(*) from count_unique group by x.name,x.value;