0
votes

I am trying simple split function in Bigquery SQL and trying to get the array elements (query taken from hive SQL). But Bigquery SQL split function is making the fields repeatable and does not give desired results. Can someone help create similar query in Bigquery #standardSQL

-- Hive query
select hierarchy, hier_array, hier_array[0] as level0, hier_array[1] as level1, hier_array[2] as level2 from
( select hierarchy, split(hierarchy,'-') as hier_array
from gcs_publish.cr_party_dnm_gu_rel ) z
limit 10;

--Output desired

hierarchy   hier_array  level0  level1
10000-211817-26510-25429    ["10000","211817","26510","25429"]  10000   211817
10019-10369 ["10019","10369"]   10019   10369
10021   ["10021"]   10021   
10022-17256 ["10022","17256"]   10022   17256
10033   ["10033"]   10033   
10037-3098187   ["10037","3098187"] 10037   3098187
10042   ["10042"]   10042   
10050-11038-211637808-34880075  ["10050","11038","211637808","34880075"]    10050   11038
10052   ["10052"]   10052   
10053   ["10053"]   10053   
1
Welcome to StackOverflow, please, visit tour and read how to ask a question. Please, accept the answer if that worked for you. - PeS

1 Answers

3
votes

Below is for BigQuery Standard SQL and to help you get started

#standardSQL
SELECT 
  hierarchy, 
  hier_array, 
  hier_array[SAFE_OFFSET(0)] AS level0,  
  hier_array[SAFE_OFFSET(1)] AS level1, 
  hier_array[SAFE_OFFSET(2)] AS level2
FROM (
  SELECT hierarchy, SPLIT(hierarchy,'-') AS hier_array  
  FROM `gcs_publish.cr_party_dnm_gu_rel`
) z  
LIMIT 10   

result will be something like below

Row hierarchy                   hier_array  level0  level1  level2   
1   10000-211817-26510-25429    10000       10000   211817  26510    
                                211817               
                                26510                
                                25429                
2   10019-10369                 10019       10019   10369   null     
                                10369                
3   10021                       10021       10021   null    null       

If you will check JSON representation of this result (first row as an example)

  {
    "hierarchy": "10000-211817-26510-25429",
    "hier_array": [
      "10000",
      "211817",
      "26510",
      "25429"
    ],
    "level0": "10000",
    "level1": "211817",
    "level2": "26510"
  },  

which looks to me as what you expect