hadoop pig cannot mkdir java throw IO exception

Question

I have a very simple script example from hadoop real world solution cookbook and I try it on amazon cloudera clustertogov04 ami and it gives me the java exception of not able to mkdir?? but I have enough disk space??

[ec2-user]$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             8255928   3307252   4529300  43% /
tmpfs                  3757068         0   3757068   0% /dev/shm
/dev/xvdk            103212320    192116  97777324   1% /data

heres the script,command,error output

weblogs = load '/data2/weblogs/weblog_entries.txt' as
(md5:chararray,
url:chararray,
date:chararray,
time:chararray,
ip:chararray);
md5_grp = group weblogs by md5 parallel 4;
store md5_grp into '/data/weblogs/weblogs_md5_groups.bcp';


pig -x local -f pig02 2>err02

2013-06-20 19:57:29,499 [Thread-4] INFO org.apache.hadoop.mapred.Merger - Down to the last merge-pass, with 1 segments left of total size: 299132 bytes 2013-06-20 19:57:29,499 [Thread-4] INFO org.apache.hadoop.mapred.LocalJobRunner - 2013-06-20 19:57:29,519 [Thread-4] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001 java.io.IOException: Mkdirs failed to create file:/data/weblogs/weblogs_md5_groups.bcp/_temporary/_attempt_local_0001_r_000000_0 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:434) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:420) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:805) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:685) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextOutputFormat.getRecordWriter(PigTextOutputFormat.java:98) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:582) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:433) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:309) 2013-06-20 19:57:33,176 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
2013-06-20 19:57:33,180 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2013-06-20 19:57:33,182 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2013-06-20 19:57:33,182 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2013-06-20 19:57:33,185 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.1.2 0.10.0-cdh4.1.2 ec2-user 2013-06-20 19:57:27 2013-06-20 19:57:33 GROUP_BY

Failed!

Pig Stack Trace ---------------
ERROR 2244: Job failed, hadoop does not return any error message

org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:430)
at org.apache.pig.Main.main(Main.java:111) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Can you check if pig has the permission to write "/data/weblogs/" directory? — zsxwing

seedhead seedhead · Accepted Answer · 2013-06-20T20:19:48

Looks like your Hadoop Job can't create the directory you've specified in your STORE

Have you tried storing the output to a different location such as your home directory?

Also FYI, Pig won't save its output to a file called "weblogs_md5_groups.bcp", it will actually be creating a directory with that name.

hadoop pig cannot mkdir java throw IO exception

at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

1 Answers