I have installed the Cloudera CDH4 release and I am trying to run a MapReduce job on it. I am getting the following error:
2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry after sleeping 2000 ms
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
at ...
I am able to run the sample programs in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar, but I get this error when my own job is submitted. It looks like it is trying to access the local file system again (although I have added all the required libraries for job execution to the distributed cache, it is still trying to access a local directory). Is this issue related to user privileges?
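For illustration, here is a minimal sketch of one way I could stage the jars so they resolve on HDFS (the helper name addJarToJobClasspath is mine; /tools-lib is the HDFS directory from the listing below, and as far as I know a Path without a scheme is resolved against fs.defaultFS, which would explain the FileNotFoundException for the local jar path):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class CacheSetup {
    // Illustrative helper: copy a local jar to HDFS first, then put the
    // HDFS copy on the job classpath, so the client never tries to stat
    // the local path (e.g. /home/cloudera/yogesh/lib/hbase.jar) on HDFS.
    static void addJarToJobClasspath(Job job, String localJar, String hdfsDir)
            throws Exception {
        FileSystem fs = FileSystem.get(job.getConfiguration());
        Path dst = new Path(hdfsDir, new Path(localJar).getName());
        fs.copyFromLocalFile(new Path("file://" + localJar), dst);
        job.addFileToClassPath(dst);
    }
}

e.g. addJarToJobClasspath(job, "/home/cloudera/yogesh/lib/hbase.jar", "/tools-lib").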
I)
Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/
shows:
Found 8 items
drwxr-xr-x - hbase hbase 0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
drwxr-xr-x - hdfs supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
drwxr-xr-x - hdfs supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
drwxr-xr-x - hdfs supergroup 0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
drwxr-xr-x - hdfs supergroup 0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
drwxrwxrwt - hdfs supergroup 0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
drwxr-xr-x - hdfs supergroup 0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
II)
--- No result for the following ---
hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
Instead, I think with the new APIs we use:
fs.defaultFS --> hdfs://Cloudera:8020
which I have set properly. However, for "fs.default.name" I did get entries on a Hadoop 0.20.2 cluster (non-Cloudera):
cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
./core-default.xml: <name>fs.default.name</name>
./core-site.xml: <name>fs.default.name</name>
I think the CDH4 default configuration should add this entry in the respective directory (if this is a bug).
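A quick way to check which key is actually in effect (a sketch, assuming /etc/hadoop/conf is on the classpath; as far as I know, in Hadoop 2 / CDH4 fs.default.name is just a deprecated alias for fs.defaultFS, so its absence from the config files need not be a bug):

import org.apache.hadoop.conf.Configuration;

public class CheckDefaultFs {
    public static void main(String[] args) {
        // Loads core-site.xml etc. from the classpath.
        Configuration conf = new Configuration();
        // Hadoop 2 maps the deprecated key onto the new one, so both
        // lookups should print the same URI, e.g. hdfs://Cloudera:8020.
        System.out.println("fs.defaultFS    = " + conf.get("fs.defaultFS"));
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
    }
}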
The command I am using to run my program:
hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
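Given the JobSubmitter warning above ("Applications should implement Tool"), one alternative I am considering is running the driver through hadoop jar with ToolRunner, so GenericOptionsParser handles -libjars and picks up the cluster configuration from /etc/hadoop/conf. A minimal sketch (mapper/reducer setup elided; the job name is illustrative, and only com.hbase.xyz.MyClassName is from my actual job):

package com.hbase.xyz;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyClassName extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains the cluster settings plus anything
        // passed via -D or -libjars on the command line.
        Job job = Job.getInstance(getConf(), "my-hbase-job");
        job.setJarByClass(MyClassName.class);
        // ... set mapper/reducer and input/output formats here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MyClassName(), args));
    }
}

It would then be launched with something like (jar names illustrative, taken from the classpath above):

hadoop jar hbase-tools.jar com.hbase.xyz.MyClassName -libjars hbase.jar,zookeeper-3.3.2.jar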