I need help with MapFile.Reader.
I add the file to the distributed cache with the -files option:
yarn jar HadoopProjects.jar rsProject.driver -files hdfs://localhost:8020/data/mapFileTestFolder.tar.gz....
Here is where I read it, in the reducer's setup method:
@SuppressWarnings("deprecation")
@Override
protected void setup(Context context) {
    try {
        // Local paths of the files shipped with -files
        Path[] cacheLocalFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        logF.info("reducer started setup");
        for (Path path : cacheLocalFiles) {
            logF.info("reducer setup " + path.getName());
            if (path.getName().contains("mapFileTestFolder.tar.gz")) {
                // Treats the cached .tar.gz as a directory containing mapFileTestFolder
                URI mapUri = new File(path.toString() + "/mapFileTestFolder").toURI();
                logF.info("depReader init begins URI = " + mapUri.toString());
                depReader = new MapFile.Reader(FileSystem.get(context.getConfiguration()), mapUri.toString(), context.getConfiguration());
                logF.info("depReader init ends");
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
        logF.info("depReader init error - " + e);
    }
    // some other lines
}
Here is what I see in the logs:
2014-03-11 08:31:09,305 INFO [main] rsProject.myReducer: depReader init begins URI = file:/home/hadoop/Training/hadoop_work/mapred/nodemanager/usercache/hadoop/appcache/application_1394318775013_0079/container_1394318775013_0079_01_000005/mapFileTestFolder.tar.gz/mapFileTestFolder
2014-03-11 08:31:09,345 INFO [main] rsProject.myReducer: depReader init error - java.io.FileNotFoundException: File file:/home/hadoop/Training/hadoop_work/mapred/nodemanager/usercache/hadoop/appcache/application_1394318775013_0079/container_1394318775013_0079_01_000005/mapFileTestFolder.tar.gz/mapFileTestFolder/data does not exist
mapFileTestFolder.tar.gz is a compressed MapFile folder (with the index and data files in it).
I assume the file exists in the distributed cache, because execution enters the if branch when the name matches.
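To check that assumption, I could log what the cached path actually is before opening the reader: a plain file, a directory, or nothing at all. Below is a minimal sketch using only java.io; the class name CachePathCheck and the describe helper are hypothetical names made up for illustration, not part of my job or of the Hadoop API.

```java
import java.io.File;
import java.util.Arrays;

public class CachePathCheck {

    // Hypothetical helper: reports whether a local path is missing,
    // a plain file, or a directory (with its sorted entry names).
    static String describe(File f) {
        if (!f.exists()) {
            return "missing";
        }
        if (f.isFile()) {
            return "file (" + f.length() + " bytes)";
        }
        String[] names = f.list();
        if (names == null) {
            // list() returns null when the directory cannot be read
            return "directory (unreadable)";
        }
        Arrays.sort(names);
        return "directory " + Arrays.toString(names);
    }

    public static void main(String[] args) {
        // Describe every path passed on the command line
        for (String arg : args) {
            File f = new File(arg);
            System.out.println(f + " -> " + describe(f));
        }
    }
}
```

In setup I would call describe(new File(path.toString())) and describe(new File(path.toString(), "mapFileTestFolder")) and log both results, to see whether the archive was unpacked into a directory or is still a single .tar.gz file.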
Why does this happen? =/
Any help is appreciated
Thanks