I have a partitioned DataFrame, say df1. From df1 I create df2 and df3:
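from pyspark.sql.functions import concat, sum
df1 = df1.withColumn("key", concat("col1", "col2", "col3"))
df1 = df1.repartition(400, "key")  # hash-partition on the derived key
df2 = df1.groupBy("col1", "col2").agg(sum("colx"))  # colx is the column being summed
df3 = df1.join(df2, ["col1", "col2"])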
Will df3 retain the same partitioning as df1, or do I need to repartition df3 again?