I'm trying to do a left join in pyspark on two columns of which just one is named identical:
How could I drop both columns of the joined dataframe df2.date and df2.accountnr?
dfAll = (
df1
.join(df2,
[df1.order_date == df2.date, df1.accountnr== df2.accountnr],
how = 'left')
)
Adding
.drop(df2.date, df2.accountnr) (error: col should be a string) or
.drop('date', 'accountnr') (error: ambiguous column names) does unfortunately not work.
The only solution that I can come up with is first seperately renaming df2.accountnr to df2.accountnr2 and then use .drop('date', 'accountnr2')
Is there a better solution to directly drop the joining columns in pyspark?