I have a problem when trying to map data from a table with data from another table in Google BigQuery. To make it simple:
I have the dataTable with column data_modelname, data_version, data_date.
The mappingTable contains mapping_modelname, mapping_version, mapping_date.
When I try this code,
UPDATE dataTable
SET
dataTable.data_date = mappingTable.data_date,
dataTable.data_version = mappingTable.data_version
FROM mappingTable
WHERE
LOWER(data_modelname) IN (
SELECT DISTINCT LOWER(mapping_modelname) AS distinctModel
FROM mappingTable
WHERE mappingTable.data_version IS NOT NULL
)
I am getting the following error message from BigQuery:
UPDATE/MERGE must match at most one source row for each target row
The select distinct statement returns the list of ~300 distinct models. Thus, when considering the WHERE-condition, there will be only one result where e.g., LOWER(data_modelname) is LOWER(ABC123).
Unfortunately my mappingTable contains more than one entry per model (duplicates), but I only want the first found entry. Updating with LIMIT 1 did not work.
I also tried this code, with a mapping table with less entries and thus no duplicates. This worked well. However, it returns the same error when there is more than one modelname.
WHERE LOWER(data_modelname) = LOWER(mapping_modelname)
Could you please help me to insert (update) the from the mappingTable.data_date and mappingTable.data_version from the mapping table into the data table where the data_modelname is mapping_modelname.
Further it would be great to be able to use some wildcards or regular expressions for mapping the modelnames, e.g. WHERE LOWER(data_modelname) IS LIKE '% mapping_modelname %'.
Thanks a lot.
Edit, new problem:
I now came across another problem. Same situation and tables, but what I now need to do is to update the field data_date_Installed where model and data_version_Installed are the same. So in my data-table there is a version-field "3.4" and data_model "ABC123". In the mapping-table there is also mapping_model "ABC123" and many different versions in the mapping_version_Installed, and of course also "3.4" and the corresponding date of that version in the field mapping_date_installed.
I have tried this code, however it only matches 88 values out of 1m.
UPDATE dataTable dt
SET dt.data_date_installed = mapping_date_installed,
FROM (SELECT ARRAY_AGG(mt LIMIT 1)[ORDINAL(1)].*
FROM mappingTable mt
WHERE mt.data_installed_version IS NOT NULL
GROUP BY LOWER(mt.mapping_modelname)
) mt
WHERE LOWER(mt.mapping_modelname) = LOWER(dt.data_modelname)
AND
data_version_Installed = mapping_version_Installed
Actually I can exclude an error in the tables: mapping values are correct and data-type is also correct (date = DATE and version = STRING) in both tables. Also comparing the strings with LIKE instead of = did not result in more matching values.
Maybe the GROUP BY statement omits many possible installed_versions.
I am very happy and thankful, if you have some advice.
Edit, SOLVED:
It was possible to use the problem with simple SQL WHERE:
WHERE LOWER(mt.mapping_model) = LOWER(FINAL_modelName)
AND
data_date_installed = mt.mapping_date_installed