3
votes

I get an error when I execute the following line of code:

(deltaTarget.alias('target')
    .merge(df.alias('source'), mergeStatement)
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

The error is the following:

AnalysisException: cannot resolve new_column in UPDATE clause given columns {list of target columns}

The column 'new_column' is indeed not in the schema of the target Delta table, but according to the documentation, the merge should simply evolve the schema of the Delta table and add the column.

I also enabled autoMerge with this command:

spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

I am not sure what exactly causes this error, because in the past I was able to evolve the schema of Delta tables automatically with this exact code.

Is there something that I am overlooking?
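
For completeness, the full pattern I am using looks roughly like this (the table path and the join column are placeholders, not my real ones):

from delta.tables import DeltaTable

# Let MERGE add source columns that are missing from the target schema.
# Note: the config key must contain no stray whitespace, or it is ignored.
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

# Placeholders for illustration; the real path and key differ.
deltaTarget = DeltaTable.forPath(spark, "/mnt/delta/my_table")
mergeStatement = "target.id = source.id"

(deltaTarget.alias("target")
    .merge(df.alias("source"), mergeStatement)
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())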

2
Can you please share which Databricks runtime you are using? - alonisser
In my case this turned out to be a wrong error message hiding the real issue (and yes, it is a bug to display the wrong error message): the actual problem was mismatched types between the two fields I tried to join on in the MERGE INTO command. - alonisser
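
To quickly rule that failure mode out, you can compare the join-key types on both sides and cast explicitly before merging. A sketch, assuming the merge condition joins on a column named "id":

# A type mismatch between the join keys on the two sides of the MERGE
# can surface as an unrelated-looking AnalysisException.
# Assumption: the merge condition joins on a column named "id".
from pyspark.sql.functions import col

target_type = dict(deltaTarget.toDF().dtypes)["id"]
source_type = dict(df.dtypes)["id"]

if target_type != source_type:
    # Cast the source key to the target's type before merging.
    df = df.withColumn("id", col("id").cast(target_type))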

2 Answers

0
votes

If I'm not mistaken, you need to use the insertAll or updateAll options on the MERGE operation.

0
votes

I had the same problem. From what I can find in the Delta Lake docs, automatic schema evolution may not support updating only part of the columns with updateAll() and insertAll(), so I chose updateExpr() and insertExpr() with a big map containing all the columns.

delta lake merge: Schema validation
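
In the Python API, the equivalent of updateExpr()/insertExpr() is to pass an explicit column map to whenMatchedUpdate()/whenNotMatchedInsert(). A sketch, assuming the target is joined on a column named "id" and the source DataFrame is df:

# Build one map covering every source column, including the new one.
# Assumption: "id" is the join key; adjust to your schema.
all_columns = {c: "source." + c for c in df.columns}

(deltaTarget.alias("target")
    .merge(df.alias("source"), "target.id = source.id")
    .whenMatchedUpdate(set=all_columns)
    .whenNotMatchedInsert(values=all_columns)
    .execute())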