0 votes

I am trying to update the historical table, but am getting a merge error. When I run this cell:

%%sql

select * from main
UNION
select * from historical
where Summary_Employee_ID=25148

I get a two-row table that looks like:

EmployeeID  Name
25148       Wendy Clampett
25148       Wendy Monkey

I'm trying to update the Name... using the following merge command

%%sql
MERGE INTO main m
using historical h
on m.Employee_ID=h.Employee_ID
WHEN MATCHED THEN 
    UPDATE SET 
        m.Employee_ID=h.Employee_ID,
        m.Name=h.Name
WHEN NOT MATCHED THEN
    INSERT(Employee,Name)
    VALUES(h.Employee,h.Name)

Here's my error:

Error: mismatched input 'MERGE' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)


3 Answers

1 vote

Synapse doesn't support SQL MERGE the way Databricks does. However, you can use the Python Delta Lake API instead. Note that historical was really my updates table...

So for the above, I used:

from delta.tables import DeltaTable

# Load the target Delta table by its storage path
main = DeltaTable.forPath(spark, "path")

(main
    .alias("main")
    .merge(historical.alias("historical"),
           "main.Employee_ID = historical.Employee_ID")
    .whenMatchedUpdate(set = {
        "Employee_ID": "historical.Employee_ID",
        "Name": "historical.Name"})
    .whenNotMatchedInsert(values = {
        "Employee_ID": "historical.Employee_ID",
        "Name": "historical.Name"})
    .execute()
)
0 votes

Your goal is to upsert the target table historical, but in your query the target table is set to main instead of historical, and the UPDATE and INSERT clauses likewise have the source and target swapped.

Try the following:

%%sql
MERGE INTO historical target
USING main source
ON source.Employee_ID = target.Employee_ID
WHEN MATCHED THEN
    UPDATE SET
        target.Name = source.Name
WHEN NOT MATCHED THEN
    INSERT (Employee_ID, Name)
    VALUES (source.Employee_ID, source.Name)
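Note that if the source table holds more than one row per Employee_ID (as in the question's two-row result), MERGE can also fail because a target row matches multiple source rows. A minimal sketch that keeps one row per key before merging; the ORDER BY column is an assumption, so pick whichever row should win:

%%sql
MERGE INTO historical target
USING (
    SELECT Employee_ID, Name
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY Name) AS rn
        FROM main
    ) t
    WHERE rn = 1
) source
ON source.Employee_ID = target.Employee_ID
WHEN MATCHED THEN
    UPDATE SET target.Name = source.Name
WHEN NOT MATCHED THEN
    INSERT (Employee_ID, Name)
    VALUES (source.Employee_ID, source.Name)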
0 votes

MERGE is supported in Spark 3.0, which is currently in preview, so that might be worth a try. I did see the same error on the Spark 3.0 pool, but it's quite misleading: it actually means that you're merging on duplicate data, or that you're offering duplicate data to the original set. I validated this by querying the delta lake and the raw file for duplicates with the serverless SQL pool and PolyBase.
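A quick way to confirm this is to count the merge keys in the source table; a minimal sketch, assuming Employee_ID is the join key:

%%sql
SELECT Employee_ID, COUNT(*) AS cnt
FROM historical
GROUP BY Employee_ID
HAVING COUNT(*) > 1

Any rows returned mean the source is offering duplicate keys to the MERGE.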