We are writing the data from MySql to BigQuery. We have set some indicators like
- Insert - If the records is being added first time, then save it with 'I' in Indicator field
- Update - If the record has some updated data, then save it with 'U' in the Indicator field and ignore duplicate records if not changed.
But in case of 'Update' it's writing duplicated records as well, which has not even changed. Following is the query we are currently using to insert the data into BigQuery table. What changes can we made to this query?
"insert into `actual_table`
(
Id,
...
)
select
temp.Id,
...
case when actual.Id is null then 'I'
when actual.Id is not null and actual.field1<>temp.field1 then 'U'
end as Indicator,
FROM `temp_table` temp
left outer join `actual_table` actual
on temp.Id= actual.Id"
Actual table is the table in BigQuery whereas temp table is the staging table on bigquery. Everytime we read data from MySql, we store it in temp table.
Thanks