I have a Dataframe named df with following structure:
root
|-- country: string (nullable = true)
|-- competition: string (nullable = true)
|-- competitor: array (nullable = true)
|-- element: struct (containsNull = true)
| |-- name: string (nullable = true)
|-- time: string (nullable = true)
Which looks like this:
|country|competiton|competitor |
|___________________________________|
|USA |WN |[{Adam, 9.43}] |
|China |FN |[{John, 9.56}] |
|China |FN |[{Adam, 9.48}] |
|USA |MNU |[{Phil, 10.02}] |
|... |... |... |
I want to pivot (or something similar) the competitor Column into new columns depending on the values in each struct so it looks like this:
|country|competition|Adam|John|Phil |
|____________________________________|
|USA |WN |9.43|... |... |
|China |FN |9.48|9.56|... |
|USA |MNU |... |... |10.02 |
The names are unique so if column already exists i dont want to create a new but fill the value in the already created one. There are alot of names so it needs to be done dynamically.
I have a pretty big dataset so i cant use Pandas.