I have two DataFrames, df1 and df2. I left-join them on the id column and save the result to another DataFrame named df3. Below is the code I am using, which works as expected.

val df3 = df1.alias("tab1").join(df2.alias("tab2"), Seq("id"), "left_outer").select("tab1.*", "tab2.name", "tab2.dept", "tab2.descr")

I would like to rename the tab2.descr column to dept_full_description within the above statement.

I am aware that I could create a Seq of column names like the one below and use the toDF method:

val columnsRenamed = Seq("id", "empl_name", "name", "dept", "dept_full_description")
val df4 = df3.toDF(columnsRenamed: _*)

Is there any other way to do the aliasing in the first statement itself? My end goal is to avoid listing about 30-40 columns explicitly.

1 Answer

I'd rename the column before the join:

df1.alias("tab1").join(
   df2.withColumnRenamed("descr", "dept_full_description").alias("tab2"),
   Seq("id"), "left_outer")
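
For completeness, here is a minimal sketch of that approach next to an alternative that aliases the column inside select. The sample rows and the value names (spark, renamedFirst, aliasedInSelect) are made up for illustration, and the column layout of df1 and df2 is assumed from the question. It is written script-style, so it should paste into spark-shell, where getOrCreate() reuses the existing session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for the sketch; in spark-shell this reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("rename-on-join")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; the column names are assumed from the question.
val df1 = Seq((1, "Alice"), (2, "Bob")).toDF("id", "empl_name")
val df2 = Seq((1, "A. Smith", "ENG", "Engineering"))
  .toDF("id", "name", "dept", "descr")

// Option 1: rename on the right-hand side before joining.
val renamedFirst = df1.alias("tab1")
  .join(
    df2.withColumnRenamed("descr", "dept_full_description").alias("tab2"),
    Seq("id"), "left_outer")
  .select("tab1.*", "tab2.name", "tab2.dept", "tab2.dept_full_description")

// Option 2: keep the join as in the question and alias the one column in select.
// select needs Column arguments here, so the string names are wrapped in col(...).
val aliasedInSelect = df1.alias("tab1")
  .join(df2.alias("tab2"), Seq("id"), "left_outer")
  .select(
    col("tab1.*"),
    col("tab2.name"),
    col("tab2.dept"),
    col("tab2.descr").as("dept_full_description"))

renamedFirst.printSchema()
aliasedInSelect.printSchema()
```

If you want every column from both sides, Option 1 also works without the trailing select: a join on Seq("id") keeps id only once, so nothing has to be listed explicitly.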