I have a dataframe in which one column holds space-separated strings of the form "a b c d e ...". Call this column "col4".
I would like to split each row into multiple rows, one per element of col4, preserving the values of all the other columns.
So, for example, given a df with a single row:
col1[0] | col2[0] | col3[0] | a b c |
I would like the output to be:
col1[0] | col2[0] | col3[0] | a |
col1[0] | col2[0] | col3[0] | b |
col1[0] | col2[0] | col3[0] | c |
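To make the intended behavior concrete, here is a minimal plain-Python sketch of the row expansion I am after. It is not Spark code: the helper name `explode_rows` and the tuple-per-row layout are made up purely for illustration, and it assumes the packed column is the last field and is separated by single spaces.

```python
def explode_rows(rows, sep=" "):
    """Return one output row per token found in the last field of each input row."""
    out = []
    for *others, packed in rows:
        # Repeat the untouched columns once for every token in the packed column.
        for token in packed.split(sep):
            out.append((*others, token))
    return out


# Hypothetical single-row input mirroring the example above.
rows = [("col1[0]", "col2[0]", "col3[0]", "a b c")]
for row in explode_rows(rows):
    print(" | ".join(row))
```

The point is only that each token should end up on its own row while the other three values are copied unchanged.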
Using the split and explode functions, I have tried the following:
from pyspark.sql.functions import explode, split  # assuming PySpark
d = COMBINED_DF.select(col1, col2, col3, explode(split(my_fun(col4), " ")))
However, this results in the following output:
col1[0] | col2[0] | col3[0] | a b c |
col1[0] | col2[0] | col3[0] | a b c |
col1[0] | col2[0] | col3[0] | a b c |
which is not what I want.