I installed a pseudo-distributed version of Cloudera (CDH4) on my Linux box and ran some simple MapReduce examples successfully. However, I'm trying to get Oozie to work and am completely baffled by the error I receive when attempting to execute a simple example workflow:
tim@phocion:~$ oozie version
Oozie client build version: 3.1.3-cdh4.0.1
Copy the pre-packaged examples to HDFS and execute, per the documentation:
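(For completeness, the copy step looked roughly like this; the tarball location is from memory and may differ between installs:)
# extract the bundled examples (path is approximate) and push them to HDFS
tar -xzf /usr/share/doc/oozie/oozie-examples.tar.gz -C ~
hdfs dfs -put ~/examples /user/tim/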
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -config /user/tim/examples/apps/map-reduce/job.properties -run
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
Check to see if the directory exists:
tim@phocion:~$ hdfs dfs -ls /user/tim/examples/apps/map-reduce
Found 3 items
-rwxr-xr-x 1 tim tim 995 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/job.properties
drwxrwxr-x - tim tim 4096 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/lib
-rwxr-xr-x 1 tim tim 2559 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/workflow.xml
It does. Can I connect to phocion:8020?
tim@phocion:~$ telnet phocion 8020
Trying 127.0.1.1...
Connected to phocion.
Escape character is '^]'.
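Another quick sanity check, just my own idea, to confirm the HDFS client can actually talk to that port rather than merely open a TCP connection:
# prints size/used/available for the default filesystem if the NameNode RPC is reachable
hdfs dfs -df /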
I can. So, basically, I'm at a total loss as to what this error is trying to tell me - the directory very much does exist. I assume the error is too vague to fully communicate the actual issue, but I've found virtually nothing out there that could point me in the right direction.
I can also reproduce this error with other third-party tutorials.
I've spent so much time poring over configuration files that I don't want to look at a computer ever again. Maybe I'm overthinking the issue here, but any help would be greatly appreciated.
EDIT: Adding the full job.properties (not too different from the default):
nameNode=hdfs://phocion:8020
jobTracker=phocion:8021
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
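For what it's worth, with user.name resolving to tim, that application path expands to exactly the path named in the error:
oozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce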
MORE EDITS: I get the exact same error when the folder does NOT exist, and again after I put it back into HDFS. On the last-ditch idea that it's a permissions issue, chmod -R 777 still gives the same error. Passing the full HDFS path on the command line doesn't fix the issue either, and running the command as the oozie and even root users doesn't work:
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ hdfs dfs -put examples/ /user/tim/
12/10/04 13:26:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ hdfs dfs -chmod -R 777 /user/tim/examples/
12/10/04 13:28:16 INFO util.NativeCodeLoader: Loaded the native-hadoop library
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ sudo -u oozie oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
[sudo] password for tim:
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ sudo -u root oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
Should this command work in theory?
tim@phocion:~$ hdfs dfs -ls hdfs://phocion:8020/user/tim/examples/apps/map-reduce
ls: `hdfs://phocion:8020/user/tim/examples/apps/map-reduce': No such file or directory
This shows up in the hadoop-hdfs logs after executing the oozie command:
2012-10-04 13:50:00,152 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 113297
2012-10-04 13:50:00,874 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection to http://localhost.localdomain:50090/getimage?getimage=1&txid=113296&storageInfo=-40:2092007576:0:cluster8
2012-10-04 13:50:00,875 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.net.ConnectException: Connection refused
2012-10-04 13:50:00,876 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:395)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:234)
at sun.net.www.http.HttpClient.New(HttpClient.java:307)
at sun.net.www.http.HttpClient.New(HttpClient.java:324)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
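For what it's worth, the refused connection in that log can be reproduced with the same kind of check I used above, assuming nothing is actually listening on the SecondaryNameNode HTTP port:
# 50090 is the port the getimage call in the log is trying to reach
telnet localhost.localdomain 50090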
Check your client configuration (/etc/hadoop/conf/*-site.xml) for the following property: fs.defaultFS >> that's the prefix used whenever you don't explicitly mention "hdfs://host:port/home/dir/" or "hdfs://HA_alias/home/dir/" – Samson Scharfrichter
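A minimal way to check that property on the client (my own sketch; on this version the deprecated name fs.default.name may be what's actually set):
# look for fs.defaultFS (or the older fs.default.name) in the client config
grep -A1 'fs.default' /etc/hadoop/conf/core-site.xml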