Let's say I have a Spark DataFrame df1 with several columns (among them a column id), and a DataFrame df2 with two columns, id and other.
Is there a way to replicate the following query
sqlContext.sql("SELECT df1.*, df2.other FROM df1 JOIN df2 ON df1.id = df2.id")
using only PySpark DataFrame functions such as join(), select(), and the like?
I have to implement this join inside a function, and I don't want to be forced to pass sqlContext in as a function parameter.
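For concreteness, here is the kind of pure-DataFrame code I'm hoping for (a sketch with toy data, so the snippet is self-contained; in my real code df1 and df2 already exist, and I'm not sure the select part is the right syntax):

from pyspark.sql import SparkSession

# Toy setup only so the example runs on its own; not part of my actual function.
spark = SparkSession.builder.master("local[*]").getOrCreate()
df1 = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df2 = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "other"])

# Roughly what I'm after: join on id, keep all of df1's columns plus df2.other.
result = df1.join(df2, df1.id == df2.id).select(df1["*"], df2["other"])
result.show()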
Thanks!