I have a DataFrame on which I do a drop and a join to change the value of a column. After this change the column order of the DataFrame is different. I build the schema dynamically from the table, so the DataFrame and the schema no longer match and the insertion fails.

Example:

df = select 'yes' as x, a, b, c, d from aaaa, bbbb
originaldf = select a, b, c, d from aaaa
temp1 = df.drop('x')
join = originaldf.except(temp1)
temp2 = join.drop('c')
temp2 = temp2.withColumn('c', df('x'))

I now want to apply the schema to temp2, but its columns are in the order c, a, b, d instead of a, b, c, d. Is there a way to rearrange them in the DataFrame or anywhere else?

Thanks

1 Answer

Just select:

>>> temp2.withColumn('c', df['x']).select("a", "b", "c", "d")

or

>>> temp3 = temp2.withColumn('c', df['x'])
>>> temp3.select(sorted(temp3.columns))
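
Note that sorting only gives the right order here because a, b, c, d happen to be alphabetical. Since the schema is built dynamically from the table, it may be safer to select in the table's own column order. A minimal sketch in plain Python, assuming the target order is available as a list of names (the helper `ordered_columns` and the sample lists are hypothetical, not part of the question):

```python
def ordered_columns(current, target):
    """Return the target column order, after checking that it names
    exactly the same columns as the DataFrame currently has."""
    missing = [c for c in target if c not in current]
    extra = [c for c in current if c not in target]
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    return list(target)


# Hypothetical values: in practice `current` would be temp3.columns and
# `target` would come from the dynamically built table schema.
current = ["c", "a", "b", "d"]
target = ["a", "b", "c", "d"]
cols = ordered_columns(current, target)

# With a PySpark DataFrame this list would then be passed to select:
#   temp3 = temp3.select(*cols)
```

The check up front means a missing or unexpected column is reported by name before the insert, rather than surfacing later as a schema mismatch.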