My Spark program has to read from a directory whose subdirectories contain data with different schemas:
Dir/subdir1/files
1,10, Alien
1,11, Bob
Dir/subdir2/files
2,blue, 123, chicago
2,red, 34, Dallas
There are around 50 more directories with different schemas.
My Spark job has to read the data from all of these directories and generate a single file that merges them, as shown below:
1, 10, Alien;
1, 11, Bob;
2, blue, 123, chicago;
2, red, 34, Dallas;
A Spark DataFrame expects the schema to be the same in all directories. Is there any way I can read all these files with different schemas and merge them into a single file using Spark?
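
One possible approach, sketched here in PySpark, is to skip schema inference entirely and read every file as plain text, so each line becomes one string regardless of how many fields it has. This is only a minimal sketch under some assumptions: the files are plain comma-separated text, the desired output is each line with its fields separated by ", " and terminated by ";", and the names normalize_line, merge_directories, and the output directory name are hypothetical, chosen for illustration.

```python
def normalize_line(line: str) -> str:
    """Trim each comma-separated field, rejoin with ', ', and append ';'."""
    fields = [field.strip() for field in line.split(",")]
    return ", ".join(fields) + ";"


def merge_directories(input_glob: str, output_dir: str) -> None:
    """Read every matching file as text and write the normalized lines out."""
    # Imported lazily so normalize_line can be used without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("merge-mixed-schemas").getOrCreate()

    # spark.read.text yields a single string column named "value", one row
    # per input line, so the subdirectories do not need a common schema.
    lines = spark.read.text(input_glob)

    normalize = udf(normalize_line, StringType())
    merged = (
        lines.filter(col("value") != "")  # drop empty lines
        .select(normalize(col("value")).alias("value"))
    )

    # coalesce(1) makes Spark write a single part file inside output_dir.
    merged.coalesce(1).write.mode("overwrite").text(output_dir)
```

With the layout from the question, this could be called as merge_directories("Dir/*/*", "merged_output"). Note that Spark writes a directory containing one part file rather than a single named file, and the order of rows in the output is not guaranteed to follow the directory order.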