1
votes

I need to get specific arrays from table in BigQuery. Then I want to reduce rows if some array from partitioned window already has all values of current array and some another values.

with t0 as (SELECT 1 as big_id, '101' as small_id,  0.99 as bottom, 1.03 top
      UNION ALL SELECT 1, '102', 1.05, 1.09
      UNION ALL SELECT 1, '103', 1.09, 1.13
      UNION ALL SELECT 1, '104', 1.2, 1.25
      UNION ALL SELECT 1, '105', 1.33, 1.39
      UNION ALL SELECT 2, '102', 1.05, 1.09
      UNION ALL SELECT 2, '103', 1.09, 1.13
      UNION ALL SELECT 2, '104', 1.2, 1.25
      UNION ALL SELECT 2, '105', 1.33, 1.39)
SELECT t0.big_id, row_number() OVER (PARTITION BY t0.big_id) group_id, ARRAY_AGG(t1.small_id) my_arrays FROM t0
CROSS JOIN t0 t1

WHERE t0.big_id = t1.big_id AND t1.top/t0.bottom BETWEEN 1 AND 1.15
GROUP BY t0.big_id, t0.small_id

I have a table with ids and top and bottom of confidence intervals. I want to compare all unique small_id pairs beginning from small_id with lower bottom. Unique pair means: do not need to compare 102 and 101 if 101 and 102 compared already. Then I need to group small_ids with similar confidence intervals into arrays. Then I need to reduce group if all ids matched in some bigger group in same partitioned window. small_id is not numeric. Just text string. So not possible directly compare small_id as numbers using <>.

These rows what I need to reduce because I got values in another arrays

enter image description here

How I need to modify my query to get expected output?

1
all those crosses and arrows are really not helping - please clarify what is your expected result in plain view and also what is the logic you are trying to implement in your query - Mikhail Berlyant
Thank you for reply! I added more details of expected result - Timogavk

1 Answers

1
votes

Below is for BigQuery Standard SQL

#standardsql
with t0 as (SELECT 1 as big_id, '101' as small_id,  0.99 as bottom, 1.03 top
  UNION ALL SELECT 1, '102', 1.05, 1.09
  UNION ALL SELECT 1, '103', 1.09, 1.13
  UNION ALL SELECT 1, '104', 1.2, 1.25
  UNION ALL SELECT 1, '105', 1.33, 1.39
  UNION ALL SELECT 2, '102', 1.05, 1.09
  UNION ALL SELECT 2, '103', 1.09, 1.13
  UNION ALL SELECT 2, '104', 1.2, 1.25
  UNION ALL SELECT 2, '105', 1.33, 1.39
), temp as (      
  SELECT t0.big_id, 
    row_number() OVER (PARTITION BY t0.big_id) group_id, 
    ARRAY_AGG(t1.small_id) my_arrays FROM t0
  CROSS JOIN t0 t1
  WHERE t0.big_id = t1.big_id AND t1.top/t0.bottom BETWEEN 1 AND 1.15
  GROUP BY t0.big_id, t0.small_id
)
select big_id, group_id, any_value(my_arrays) my_arrays 
from (
  select t1.*,
    ( select count(1)
      from t1.my_arrays id
      join t2.my_arrays id
      using(id)
      where t1.group_id != t2.group_id
    ) = array_length(t1.my_arrays) as flag
  from temp t1 
  join temp t2
  using (big_id)
)
group by big_id, group_id
having countif(flag) = 0     

with output

enter image description here