3
votes

I get an error when I execute the following line of code:

(deltaTarget.alias('target')
    .merge(df.alias('source'), mergeStatement)
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

The error is the following:

AnalysisException: cannot resolve new_column in UPDATE clause given columns {list of target columns}

The column 'new_column' is indeed not in the schema of the target Delta table, but according to the documentation, the merge should simply evolve the schema of the Delta table and add the column.

I also enabled autoMerge with this command:

spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

I am not sure what exactly causes this error, because in the past I was able to evolve the schema of Delta tables automatically with this exact code.

Is there something that I am overlooking?
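
For completeness, the full pattern I am using looks roughly like this (the table path and the join column are placeholders, not my real ones):

from delta.tables import DeltaTable

# Let MERGE add source columns that are missing from the target schema.
# Note: the config key must contain no stray whitespace, or it is ignored.
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

# Placeholders for illustration; the real path and key differ.
deltaTarget = DeltaTable.forPath(spark, "/mnt/delta/my_table")
mergeStatement = "target.id = source.id"

(deltaTarget.alias("target")
    .merge(df.alias("source"), mergeStatement)
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())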

2
Can you please share which Databricks runtime you are using? - alonisser
In my case this turned out to be a wrong error message hiding the real issue (and yes, it is a bug to display the wrong error message): the actual problem was mismatched types between the two fields I tried to join on in the MERGE INTO command. - alonisser
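
To quickly rule that failure mode out, you can compare the join-key types on both sides and cast explicitly before merging. A sketch, assuming the merge condition joins on a column named "id":

# A type mismatch between the join keys on the two sides of the MERGE
# can surface as an unrelated-looking AnalysisException.
# Assumption: the merge condition joins on a column named "id".
from pyspark.sql.functions import col

target_type = dict(deltaTarget.toDF().dtypes)["id"]
source_type = dict(df.dtypes)["id"]

if target_type != source_type:
    # Cast the source key to the target's type before merging.
    df = df.withColumn("id", col("id").cast(target_type))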

2 Answers

0
votes

If I'm not mistaken, you need to use the insertAll or updateAll options on the MERGE operation.

0
votes

I had the same problem. From what I can find in the Delta Lake docs, automatic schema evolution may not support updating only part of the columns with updateAll() and insertAll(), so I chose updateExpr() and insertExpr() with a big map containing all the columns.

delta lake merge: Schema validation
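
In the Python API, the equivalent of updateExpr()/insertExpr() is to pass an explicit column map to whenMatchedUpdate()/whenNotMatchedInsert(). A sketch, assuming the target is joined on a column named "id" and the source DataFrame is df:

# Build one map covering every source column, including the new one.
# Assumption: "id" is the join key; adjust to your schema.
all_columns = {c: "source." + c for c in df.columns}

(deltaTarget.alias("target")
    .merge(df.alias("source"), "target.id = source.id")
    .whenMatchedUpdate(set=all_columns)
    .whenNotMatchedInsert(values=all_columns)
    .execute())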