
I want my Hadoop jobs to fetch data from the leaf nodes of the sub-directories. The data will always be present only in the leaf files, which have a .dat extension.

To illustrate, a sample sub-directory structure:

say a->b->1.dat, a->c->2.dat
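For concreteness, that layout can be recreated locally like this (directory and file names taken from the example above; the file contents are placeholders):

```shell
# Build the nested layout: .dat files only at the leaves.
mkdir -p a/b a/c
echo "record1" > a/b/1.dat
echo "record2" > a/c/2.dat

# List the leaf data files.
find a -name '*.dat'
```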

I tried doing `fs -put` on directory "a" to HDFS and then specifying "a" as the input to the Hadoop job, but it fails. However, this method works fine if the .dat files are directly inside "a".

Any possible solution?


1 Answer


Using MultipleInputFormat we can read two files of different formats, and the combined output of both goes to the reducer.

Kindly look into the link below.

https://github.com/subbu-m/MultipleInputFormat
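As a rough sketch of the idea, Hadoop's own `MultipleInputs` helper (new MapReduce API) lets a driver wire each input path to its own input format and mapper, with both mapper outputs flowing into one reducer. The class names `TextMapper`, `SequenceMapper`, and `CombiningReducer` below are hypothetical placeholders, not part of the linked project:

```java
// Sketch of a driver using org.apache.hadoop.mapreduce.lib.input.MultipleInputs.
// Mapper/reducer classes and paths are hypothetical; both mappers must emit
// the same intermediate key/value types for the shared reducer.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiInputDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "multi-input");
        job.setJarByClass(MultiInputDriver.class);

        // Each input path gets its own InputFormat and Mapper.
        MultipleInputs.addInputPath(job, new Path(args[0]),
                TextInputFormat.class, TextMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]),
                SequenceFileInputFormat.class, SequenceMapper.class);

        // One reducer sees the combined output of both mappers.
        job.setReducerClass(CombiningReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Since this is a job driver, it only runs against a Hadoop installation (local or cluster), so treat it as a configuration sketch rather than a standalone program.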