0
votes

I am trying to join two tables in big query, Table1 contains an ID column, and Table2 contains a column which has the same ID or multiple ID's in the form of a long string separated by commas, like "id123,id456,id678"

I can join the tables together if Table1.ID = Table2.ID but this ignores all the rows where Table1.ID is one of the multiple IDs in Table2.ID. I have looked at similar post that tell me to use wildcards like

on concat('%',Table1.ID,'%') = Table2.ID

but this does not work, because it seems to create a string that contains the '%' character and doesn't actually use it as a wildcard.

I'm using standard sql in BigQuery, any help would be appreciated

2
Something like on Table2.ID like concat('%',Table1.ID,'%') ?jarlh
This is the error I got when using like instead "LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join." Is it my type of join that is restricting this? Ideally need a left joinRyan

2 Answers

2
votes

Below example is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table1` AS (
  SELECT 123 id, 'a' test UNION ALL
  SELECT 456, 'b' UNION ALL
  SELECT 678, 'c'
), `project.dataset.table2` AS (
  SELECT 'id123,id456' id UNION ALL
  SELECT 'id678'
)
SELECT t2.id, test
FROM `project.dataset.table2` t2, UNNEST(SPLIT(id)) id2
JOIN `project.dataset.table1` t1
ON CONCAT('id', CAST(t1.id AS STRING)) = id2  

result is as below

Row id              test     
1   id123,id456     a    
2   id123,id456     b    
3   id678           c    
0
votes

It is doubtful that you have values in the table that start and end with percentage signs. = does not recognize wildcards; like does:

on Table2.ID like concat('%', Table1.ID, '%')

As a warning. Such a construct is usually a performance killer. You would be better off trying to have columns in Table1 and Table2 that match exactly.