1
votes

I am trying to get quantiles of a field and I want to explode them so that each value is a separate row rather than all of them forming a single array. First, I calculate 20 quantiles as below:

select percentile_approx(probability, 
                                       array(0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 
                                             0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 
                                             0.8, 0.85, 0.9, 0.95, 1.0)) as quantiles
from my_table

The above code gives the array below:

[0.17808226409449213, 0.18250386256254247, 0.18525207046272224, 0.18800918537059694, 0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226, 0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496, 0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575, 0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]

But since I wanted to explode it, I tried using lateral view posexplode like below (actually, I passed the output from the above code):

select i, x
lateral view posexplode([0.17808226409449213, 0.18250386256254247,0.18525207046272224, 0.18800918537059694, 
                    0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226, 
                    0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496, 
                    0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575, 
                    0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]) q as i, x

but it gives the error message below:

ParseException: "\nextraneous input '[' expecting {'(', ')', 'SELECT', 'FROM', 'ADD', 'AS', 'ALL', 'ANY', 'DISTINCT', 'WHERE', 'GROUP', 'BY', 'GROUPING', 'SETS', 'CUBE', 'ROLLUP', 'ORDER', 'HAVING', 'LIMIT', 'AT', 'OR', 'AND', 'IN', NOT, 'NO', 'EXISTS', 'BETWEEN', 'LIKE', RLIKE, 'IS', 'NULL', 'TRUE', 'FALSE', 'NULLS', 'ASC', 'DESC', 'FOR', 'INTERVAL', 'CASE', 'WHEN', 'THEN', 'ELSE', 'END', 'JOIN', 'CROSS', 'OUTER', 'INNER', 'LEFT', 'SEMI', 'RIGHT', 'FULL', 'NATURAL', 'ON', 'PIVOT', 'LATERAL', 'WINDOW', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'UNBOUNDED', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'FIRST', 'AFTER', 'LAST', 'ROW', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'DIRECTORY', 'VIEW', 'REPLACE', 'INSERT', 'DELETE', 'INTO', 'DESCRIBE', 'EXPLAIN', 'FORMAT', 'LOGICAL', 'CODEGEN', 'COST', 'CAST', 'SHOW', 'TABLES', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'DROP', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'TO', 'TABLESAMPLE', 'STRATIFY', 'ALTER', 'RENAME', 'ARRAY', 'MAP', 'STRUCT', 'COMMENT', 'SET', 'RESET', 'DATA', 'START', 'TRANSACTION', 'COMMIT', 'ROLLBACK', 'MACRO', 'IGNORE', 'BOTH', 'LEADING', 'TRAILING', 'IF', 'POSITION', 'EXTRACT', '+', '-', '*', 'DIV', '~', 'PERCENT', 'BUCKET', 'OUT', 'OF', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'OVERWRITE', 'TRANSFORM', 'REDUCE', 'SERDE', 'SERDEPROPERTIES', 'RECORDREADER', 'RECORDWRITER', 'DELIMITED', 'FIELDS', 'TERMINATED', 'COLLECTION', 'ITEMS', 'KEYS', 'ESCAPED', 'LINES', 'SEPARATED', 'FUNCTION', 'EXTENDED', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'LAZY', 'FORMATTED', 'GLOBAL', TEMPORARY, 'OPTIONS', 'UNSET', 'TBLPROPERTIES', 'DBPROPERTIES', 'BUCKETS', 'SKEWED', 'STORED', 'DIRECTORIES', 'LOCATION', 'EXCHANGE', 'ARCHIVE', 'UNARCHIVE', 'FILEFORMAT', 'TOUCH', 'COMPACT', 'CONCATENATE', 'CHANGE', 'CASCADE', 'RESTRICT', 'CLUSTERED', 'SORTED', 'PURGE', 'INPUTFORMAT', 'OUTPUTFORMAT', DATABASE, DATABASES, 'DFS', 'TRUNCATE', 'ANALYZE', 'COMPUTE', 'LIST', 'STATISTICS', 'PARTITIONED', 'EXTERNAL', 'DEFINED', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'REPAIR', 'RECOVER', 'EXPORT', 'IMPORT', 'LOAD', 'ROLE', 'ROLES', 'COMPACTIONS', 'PRINCIPALS', 'TRANSACTIONS', 'INDEX', 'INDEXES', 'LOCKS', 'OPTION', 'ANTI', 'LOCAL', 'INPATH', STRING, BIGINT_LITERAL, SMALLINT_LITERAL, TINYINT_LITERAL, INTEGER_VALUE, DECIMAL_VALUE, DOUBLE_LITERAL, BIGDECIMAL_LITERAL, IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 3, pos 24)\n\n== SQL ==\n\nselect i, x\nlateral view posexplode([0.17808226409449213, 0.18250386256254247,0.18525207046272224, 0.18800918537059694, \n------------------------^^^\n 0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226, \n 0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496, \n 0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575, \n 0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]) q as i, x\n"

On the other hand, if I create the array inside posexplode like below, it works fine:

select i, x
lateral view posexplode(array('a', 'b', 'c')) q as i, x

|  i|  x|
+---+---+
|  0|  a|
|  1|  b|
|  2|  c|
+---+---+
1
You cannot create array using square brackets. Such syntax is not suported. Use array() instead as in your last query.leftjoin
That is the output from the first code chunk. It creates the quantiles with square brackets.Fisseha Berhane
what I basically want is lateral vie posexplode(percentile_approx(probability, ....Fisseha Berhane

1 Answers

1
votes
select s.*, e.i, e.x
from
    (
    select percentile_approx(probability, 
                               array(0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 
                                    0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 
                                    0.8, 0.85, 0.9, 0.95, 0.99)) as quantiles
    from my_table
    ) s lateral view outer posexplode (s.quantiles) e as i, x
;