0 votes

I am trying to install CDH4 from the downloadable tarballs, but I am running into issues. The steps I have taken are as follows:

I downloaded the tarball from https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs

I first untarred the hadoop-0.20-mapreduce-0.20.2+1341 tar file.
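
For reference, the extract step was roughly the following (the exact archive name and extension may differ from what you download from the page above, and /home is just where I chose to unpack it):

cd /home
# extract the MRv1 tarball (archive name approximate)
tar -xzf hadoop-0.20-mapreduce-0.20.2+1341.tar.gz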

I made the configuration changes in hadoop-0.20-mapreduce-0.20.2+1341, since I wanted MRv1, not YARN.

The first thing mentioned in the CDH4 installation instructions was to configure HDFS.

I made the relevant changes (sketched below) in:

core-site.xml
hdfs-site.xml
mapred-site.xml
masters --- which lists my namenode
slaves --- which lists my datanodes
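
Roughly, the settings looked like this; "master", "slave1"/"slave2" and the /data paths are placeholders standing in for my real hostnames and directories, and I am using the older MRv1-style property names:

# conf dir of the unpacked tarball (location may differ in your layout)
CONF=/home/hadoop-2.0.0-mr1-cdh4.2.0/conf

cat > $CONF/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:8020</value>
  </property>
</configuration>
EOF

cat > $CONF/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/dfs/nn</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/dfs/dn</value>
  </property>
</configuration>
EOF

cat > $CONF/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:8021</value>
  </property>
</configuration>
EOF

echo master  > $CONF/masters
echo slave1  > $CONF/slaves
echo slave2 >> $CONF/slaves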

I copied the Hadoop configuration to all the nodes in the cluster

and then did a namenode format.
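
The format itself was the usual:

bin/hadoop namenode -format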

After the format I had to start the cluster, but I could not find a start-all.sh script in the bin folder, so I started it with the command

bin/start-mapred.sh

The logs show that the jobtracker started and that the tasktrackers started on the slave nodes, but when I do a jps I can see only

jobtracker
jps
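
(For reference, this is how I have been checking the daemon logs; the directory is just where my tarball writes its logs, and the file names follow the usual hadoop-<user>-<daemon>-<host>.log pattern.)

tail -n 50 /home/hadoop-2.0.0-mr1-cdh4.2.0/logs/hadoop-*-jobtracker-*.log
# run this one on a slave node
tail -n 50 /home/hadoop-2.0.0-mr1-cdh4.2.0/logs/hadoop-*-tasktracker-*.log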

Going further, I started the datanode on the datanode machine with the command below:

bin/hadoop-daemon.sh start datanode

It shows that the datanode started.

The namenode is not getting started, and the tasktracker is not getting started.

When I checked my logs I could see

ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.FileNotFoundException: webapps/hdfs not found in CLASSPATH
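
(If it is relevant: the error seems to say that the HDFS web application directory is not on the classpath, so I have also been checking whether a webapps/hdfs directory exists anywhere under my install; the path below is just my layout.)

find /home/hadoop-2.0.0-mr1-cdh4.2.0 -type d -path '*webapps/hdfs'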

I am not sure what is stopping my cluster from working.

Earlier I had CDH3 running, so I stopped the CDH3 cluster and then started installing CDH4. I also changed all the directories in hdfs-site.xml, i.e. pointed them to new, empty directories for the namenode and datanode instead of the ones defined for CDH3.

Still, nothing seems to help.

I also turned off the firewall, since I do have root access, but that did not work for me either.

Any help on the above would be greatly appreciated.


Thank you for the kind reply, but I do not have a start-dfs.sh file in the bin folder.

The only files in the /home/hadoop-2.0.0-mr1-cdh4.2.0/bin folder are:

start-mapred.sh
stop-mapred.sh
hadoop-daemon.sh
hadoop-daemons.sh
hadoop-config.sh
rcc
slaves.sh
hadoop

The commands I am using now are as below.

For starting the datanode:

for x in /home/hadoop-2.0.0-mr1-cdh4.2.0/bin/hadoop-* ; do $x start datanode ; done ;

For starting the namenode:

bin/start-mapred.sh

I am still working on the same issue.


2 Answers

1 vote

Hi, sorry for the misunderstanding above. The following commands can be run to start your datanodes and namenode.

To start namenode:

hadoop-daemon.sh  start namenode 

To start datanode:

hadoop-daemons.sh  start datanode 

To start secondarynamenode:

hadoop-daemons.sh --hosts masters start secondarynamenode
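
Putting these together, a full start-up sequence on the master would look roughly like this (this assumes the tarball layout from your question, /home/hadoop-2.0.0-mr1-cdh4.2.0, and that the namenode has already been formatted once):

cd /home/hadoop-2.0.0-mr1-cdh4.2.0
bin/hadoop-daemon.sh start namenode                            # namenode on this machine
bin/hadoop-daemons.sh start datanode                           # datanodes on the hosts listed in conf/slaves
bin/hadoop-daemons.sh --hosts masters start secondarynamenode  # secondarynamenode on the hosts in conf/masters
bin/start-mapred.sh                                            # jobtracker here, tasktrackers on the slaves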
-1 votes

The jobtracker daemon will get started on your master node and the tasktracker daemons will get started on each of your datanodes after you run the command

bin/start-mapred.sh

In this Hadoop cluster setup, only the jobtracker daemon will be shown by the jps command on the masternode, and on each of your datanodes you can see the tasktracker daemon running by using the jps command.
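
For example, at this stage (before HDFS is started) jps on the masternode would show roughly

jobtracker
jps

and on each datanode

tasktracker
jps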

Then you have to start HDFS by running the following command on your masternode

bin/start-dfs.sh

This command will start the namenode daemon on your namenode machine (in this configuration, your masternode itself, I believe) and datanode daemons will be started on each of your slave nodes.

Now you can run jps on each of your datanodes and it will give the output:

tasktracker
datanode
jps
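
On the masternode, jps should then show the namenode alongside the jobtracker (and possibly a secondarynamenode, depending on what is listed in your masters file):

namenode
jobtracker
jps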

I think this link will be useful: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/