5
votes

I installed a pseudo-distributed version of Cloudera on my Linux box, and ran some simple MapReduce examples with success. However, I'm trying to get Oozie to work, and am completely baffled by the errors I am receiving when attempting to execute a simple job workflow:

tim@phocion:~$ oozie version
Oozie client build version: 3.1.3-cdh4.0.1

Copy the pre-packaged examples to HDFS and execute, per the documentation:

tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -config /user/tim/examples/apps/map-reduce/job.properties -run
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist

Check to see if the file exists:

tim@phocion:~$ hdfs dfs -ls /user/tim/examples/apps/map-reduce
Found 3 items
-rwxr-xr-x   1 tim tim        995 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/job.properties
drwxrwxr-x   - tim tim       4096 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/lib
-rwxr-xr-x   1 tim tim       2559 2012-10-03 14:47 /user/tim/examples/apps/map-reduce/workflow.xml

It does. Can I connect to phocion:8020?

tim@phocion:~$ telnet phocion 8020
Trying 127.0.1.1...
Connected to phocion.
Escape character is '^]'.

I can. So, basically, I'm at a total loss as to what this error is trying to tell me - the folder very much does exist. I'm assuming the error is too vague to fully communicate what the issue is, but I've found virtually nothing out there that could point me in the right direction.

I can also replicate this error with other 3rd party tutorials.

I've spent so much time poring over configuration files that I never want to look at a computer again. Maybe I'm overthinking the issue, but any help would be greatly appreciated.

EDIT: Adding the full job.properties (not too different from the default):

nameNode=hdfs://phocion:8020
jobTracker=phocion:8021
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
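
To see which path Oozie ends up checking, the variable substitution in job.properties can be replayed by hand. The following is only a sketch, not how the Oozie client itself is implemented: `resolve_app_path` is a hypothetical helper, it handles a single level of `${...}` references, and it takes `user.name` as an argument because Oozie fills that in from the submitting user rather than from the file.

```shell
# Hypothetical helper: expand ${...} references in oozie.wf.application.path.
# $1 = path to a local job.properties, $2 = value to use for ${user.name}.
resolve_app_path() {
  awk -F= -v user="$2" '
    /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
    {
      key = $1
      val = substr($0, length($1) + 2)      # everything after the first "="
      props[key] = val
    }
    END {
      props["user.name"] = user             # not in the file; supplied by caller
      path = props["oozie.wf.application.path"]
      for (k in props) {
        token = "${" k "}"
        # replace every literal occurrence of ${key} with its value
        while ((i = index(path, token)) > 0)
          path = substr(path, 1, i - 1) props[k] substr(path, i + length(token))
      }
      print path
    }
  ' "$1"
}

# Usage: resolve_app_path /home/tim/examples/apps/map-reduce/job.properties tim
```

For the properties above and user `tim`, this yields the same `hdfs://phocion:8020/user/tim/examples/apps/map-reduce` that appears in the E0504 message, which suggests the substitution itself is not the problem.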

MORE EDITS: I get the exact same error when the folder DOES NOT exist, and again after I put it back into HDFS. As a last-ditch idea that it's a permissions issue, I tried chmod 777 and still get the same error. Passing the full HDFS path on the command line doesn't fix the issue. Running it under the oozie and even root accounts doesn't work either:

tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ hdfs dfs -put examples/ /user/tim/
12/10/04 13:26:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ hdfs dfs -chmod -R 777 /user/tim/examples/
12/10/04 13:28:16 INFO util.NativeCodeLoader: Loaded the native-hadoop library
tim@phocion:~$ oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ sudo -u oozie oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
[sudo] password for tim: 
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist
tim@phocion:~$ sudo -u root oozie job -oozie http://phocion:11000/oozie -run -config /home/tim/examples/apps/map-reduce/job.properties -Doozie.wf.application.path=hdfs://phocion:8020/user/tim/examples/apps/map-reduce
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist

Should this command work in theory?

tim@phocion:~$ hdfs dfs -ls hdfs://phocion:8020/user/tim/examples/apps/map-reduce
ls: `hdfs://phocion:8020/user/tim/examples/apps/map-reduce': No such file or directory

This shows up in hadoop-hdfs logs after executing the oozie command:

2012-10-04 13:50:00,152 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 113297
2012-10-04 13:50:00,874 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection to http://localhost.localdomain:50090/getimage?getimage=1&txid=113296&storageInfo=-40:2092007576:0:cluster8
2012-10-04 13:50:00,875 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.net.ConnectException: Connection refused
2012-10-04 13:50:00,876 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at java.net.Socket.connect(Socket.java:478)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:395)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:234)
        at sun.net.www.http.HttpClient.New(HttpClient.java:307)
        at sun.net.www.http.HttpClient.New(HttpClient.java:324)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
3
As noted in the "Running the Examples" section of the Oozie docs at archive.cloudera.com/cdh4/cdh/4/oozie/…, can you try passing a proper local path to -config at the CLI instead of an HDFS-style path? The -config option expects the properties file to be a locally available, configurable file, not one inside HDFS. - Harsh J

@Harsh-J Appreciate the help. I had -config point to the config file outside HDFS. Same error: tim@phocion:~$ oozie job -oozie phocion:11000/oozie -config /home/tim/examples/apps/map-reduce/job.properties -run Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/map-reduce] does not exist - phocion

Can you check the Hadoop config (/etc/hadoop/conf/*-site.xml) for the following property: fs.defaultFS? That is the prefix used whenever you don't explicitly mention "hdfs://host:port/home/dir/" or "hdfs://HA_alias/home/dir/". - Samson Scharfrichter

BTW, you should check the config on your edge node (the one you are running the HDFS command line on), but also on the node running the Oozie server, if different, and on all cluster nodes that might be running the Oozie actions. - Samson Scharfrichter
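
One way to follow the fs.defaultFS suggestion is to read the property straight out of the site file. This is a rough sketch: `get_hadoop_prop` is a hypothetical helper, and it assumes the stock layout where `<name>` comes before `<value>` inside each `<property>`; it is not a real XML parser.

```shell
# Hypothetical helper: print the <value> of one property in a Hadoop *-site.xml.
# $1 = config file, $2 = property name (for example fs.defaultFS).
get_hadoop_prop() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>[ \t]*/, "", n)          # drop everything up to <name>
      sub(/[ \t]*<\/name>.*/, "", n)        # drop </name> and what follows
      found = (n == want)
    }
    found && /<value>/ {
      v = $0
      sub(/.*<value>[ \t]*/, "", v)
      sub(/[ \t]*<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Usage: get_hadoop_prop /etc/hadoop/conf/core-site.xml fs.defaultFS
```

If this prints nothing, or something other than the `hdfs://phocion:8020` used in job.properties, then scheme-less paths such as `/user/tim/...` on the command line may be resolving against a different file system than the one Oozie checks. Where the client supports it, `hdfs getconf -confKey fs.defaultFS` should report the effective value as well.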

3 Answers

1
votes

In addition to HarshJ's comment, check your error message:

Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/examples/apps/demo] does not exist

And the hadoop fs -ls listing you provided:

/user/tim/examples/apps/map-reduce/

And play spot the difference:

/user/tim/examples/apps/demo
/user/tim/examples/apps/map-reduce/

Try configuring as follows:

oozie.wf.application.path=/user/tim/examples/apps/map-reduce
0
votes

I had the same issue and fixed it by exporting the correct Oozie URL.

Export it with the following command:

export OOZIE_URL=http://someip:11000/oozie

To find this Oozie URL, use Hue to connect to your cluster and navigate to Workflows, where you will find a tab called oozie. Inside it you should see gauges listing many properties. Look for the property oozie.servers.
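
Once OOZIE_URL is exported, the client picks it up as the default server, so the -oozie flag can be left off. A minimal sketch, assuming the host and local properties path from the question (substitute your own); the `command -v` guard is only there so the snippet fails gently on a machine without the client installed.

```shell
# Assumption: server address taken from the question; replace with your own.
export OOZIE_URL=http://phocion:11000/oozie

if command -v oozie >/dev/null 2>&1; then
  # No -oozie flag: the client falls back to $OOZIE_URL.
  oozie job -config /home/tim/examples/apps/map-reduce/job.properties -run
else
  echo "oozie client not found on PATH" >&2
fi
```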

-1
votes

You need to -copyFromLocal the examples folder to the location specified in the job's config.