0
votes

There is a string column converted from float(double) one in a hive table. I need a table that represents number of digits after dot with count rows for each number.

+-----+------+--+
| num | _c0  |
+-----+------+--+
|  2  | 300  |
|  3  | 400  |
|  4  | 248  |
|  5  | 117  |
|  6  |  43  |
| NULL| 999  |
+-----+------+--+

There is a function to obtain number of digits after dot in column foo

length(split(foo, '\\.')[1])

So, my failed attempt to obtain the above table was

select length(split(foo, '\\.')[1]) as num, count(num) from tbl_bar group by num;

The error message was

Error: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:77 Invalid table alias or column reference 'num': (possible column names are: foo, moo, hroo) (state=42000,code=10004)

What is the correct query to get distribution by number of digits after the dot in column foo?

1

1 Answers

1
votes

Column-aliases can't be selected in the query at the same level. Use the actual calculation instead.

select length(split(foo, '\\.')[1]) as num, count(*) 
from tbl_bar 
group by length(split(foo, '\\.')[1]);