A new folder is created under a base HDFS directory every time a job runs, and each folder contains .dat files.
I need to copy those .dat files up into the base directory using Scala and then archive the sub-directories.
For example:
Base directory: /user/srav/
Sub-directories: /user/srav/20190101, /user/srav/20180101
The sub-directories contain .dat files such as /user/srav/20190101/test1.dat and /user/srav/20180101/test2.dat. I need to copy these files under /user/srav/ and then archive the 20190101 and 20180101 folders. Please suggest how this could be implemented with Spark/Scala (Spark 2.0).
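For reference, here is a minimal sketch of the flow I have in mind, written against the local filesystem so it is self-contained. The function name `promoteDatFilesAndArchive` and the `.archived` rename are my own illustrative choices, not an established API; on HDFS the same steps would map to `fs.listStatus` and `fs.rename` on `org.apache.hadoop.fs.FileSystem` (or a `hadoop archive` call if a real .har archive is wanted), and it assumes the base directory exists.

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

// Sketch: for each sub-directory of `base`, move its .dat files up into
// `base`, then "archive" the sub-directory by renaming it with an
// ".archived" suffix. Returns the names of the sub-directories processed.
def promoteDatFilesAndArchive(base: File): Seq[String] = {
  val subDirs = base.listFiles.filter(_.isDirectory).toSeq
  subDirs.foreach { dir =>
    // Move the .dat files up to the base directory.
    dir.listFiles.filter(_.getName.endsWith(".dat")).foreach { f =>
      Files.move(f.toPath, new File(base, f.getName).toPath,
        StandardCopyOption.REPLACE_EXISTING)
    }
    // Mark the folder as archived by renaming it in place.
    Files.move(dir.toPath, new File(base, dir.getName + ".archived").toPath)
  }
  subDirs.map(_.getName)
}
```

On a real cluster the driver (not the executors) would run this kind of housekeeping, since it is pure filesystem metadata work rather than a distributed computation.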