I am trying to install Hadoop 1.2.1 on a test cluster of 5 machines, with one node serving as JobTracker, NameNode, and Secondary NameNode. The remaining 4 machines are slaves.
There are two issues.
1) In the master's conf/masters and conf/slaves files, I provided the IP addresses of the master and the slaves, respectively. On each slave, the masters file is empty and the slaves file contains only that slave's own IP.
When I start Hadoop (bin/start-all.sh), the TaskTracker and DataNode don't start. I added the host names of these machines to /etc/hosts and also tried using host names instead of IPs in the masters and slaves files. This makes no difference: the TaskTracker and DataNode still don't start.
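To make the layout concrete, here is a small sketch that writes equivalent files into a scratch directory. The addresses are placeholders (10.0.0.1 standing in for the master, 10.0.0.2 to 10.0.0.5 for the slaves), not the real ones, and the scratch directory only stands in for conf/ on the master.

```shell
#!/usr/bin/env bash
# Placeholder addresses only: 10.0.0.1 = master, 10.0.0.2-5 = slaves (assumed, not the real IPs).
demo=$(mktemp -d)   # scratch directory standing in for conf/ on the master

# conf/masters: in Hadoop 1.x this file lists where the Secondary NameNode runs
printf '%s\n' 10.0.0.1 > "$demo/masters"

# conf/slaves: one DataNode/TaskTracker host per line
printf '%s\n' 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5 > "$demo/slaves"

# Show both files, masters first
cat "$demo/masters" "$demo/slaves"
```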
While starting the Hadoop services, I get messages saying that the TaskTracker and DataNode logs have been written, but strangely I can't find them at that location. These are the messages I get:
starting datanode, logging to /home/ubuntu/hadoop-1.2.1/libexec/../logs/hadoop-ubuntu-datanode-dsparq-instance4.out
starting tasktracker, logging to /home/ubuntu/hadoop-1.2.1/libexec/../logs/hadoop-ubuntu-tasktracker-dsparq-instance2.out
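The host name at the end of each .out path hints at where to look: start-all.sh launches the slave daemons over SSH, so each daemon writes its .out file on the machine it runs on rather than on the master. Below is a minimal sketch for pulling that host name out of a path. `log_host` is a hypothetical helper, not part of Hadoop, and it assumes the user name and daemon name contain no hyphens.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): given a path of the form
#   .../hadoop-<user>-<daemon>-<host>.out
# print the <host> part. Assumes <user> and <daemon> contain no hyphens.
log_host() {
  local base="${1##*/}"   # drop the directory part
  base="${base%.out}"     # drop the .out suffix
  # Fields 1-3 are "hadoop", the user and the daemon; the rest is the host name.
  printf '%s\n' "$base" | cut -d- -f4-
}

# prints dsparq-instance4
log_host "/home/ubuntu/hadoop-1.2.1/libexec/../logs/hadoop-ubuntu-datanode-dsparq-instance4.out"
```

With that host name, something like `ssh "$(log_host <path>)" 'ls -l /home/ubuntu/hadoop-1.2.1/logs/'` could confirm whether the file exists there, assuming passwordless SSH and the same install path on every machine.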
2) In the JobTracker/NameNode log, the following exception is listed multiple times.
error: java.io.IOException: File <> could only be replicated to 0 nodes, instead of 1
The solutions to these problems on StackOverflow suggest reformatting HDFS and checking the entries in /etc/hosts. I tried both, but neither helped.
Please let me know how to fix these errors. Thank you in advance.
Adding the contents of core-site.xml and mapred-site.xml (the same on all machines):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>