3
votes

I installed Hadoop on Ubuntu 12.04 by following the procedure in the link below.

http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php

Everything installed successfully, but when I run start-all.sh only some of the services come up.

wanderer@wanderer-Lenovo-IdeaPad-S510p:~$ su - hduse
Password:

hduse@wanderer-Lenovo-IdeaPad-S510p:~$ cd /usr/local/hadoop/sbin

hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ start-all.sh

This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
hduse@localhost's password: 
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.out
hduse@localhost's password: 
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduse-datanode-wanderer-Lenovo-IdeaPad-S510p.out
Starting secondary namenodes [0.0.0.0]
[email protected]'s password: 
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduse-secondarynamenode-wanderer-Lenovo-IdeaPad-S510p.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduse-resourcemanager-wanderer-Lenovo-IdeaPad-S510p.out
hduse@localhost's password: 
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduse-nodemanager-wanderer-Lenovo-IdeaPad-S510p.out

hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ jps
7940 Jps
7545 ResourceManager
7885 NodeManager

When I stop the services by running the stop-all.sh script:

hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [localhost]
hduse@localhost's password: 
localhost: no namenode to stop
hduse@localhost's password: 
localhost: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
[email protected]'s password: 
0.0.0.0: no secondarynamenode to stop
stopping yarn daemons
stopping resourcemanager
hduse@localhost's password: 
localhost: stopping nodemanager
no proxyserver to stop

My configuration files

  1. Editing bashrc file

    vi ~/.bashrc
    
    #HADOOP VARIABLES START
    export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
    export HADOOP_INSTALL=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_INSTALL/bin
    export PATH=$PATH:$HADOOP_INSTALL/sbin
    export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_HOME=$HADOOP_INSTALL
    export HADOOP_HDFS_HOME=$HADOOP_INSTALL
    export YARN_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
    #HADOOP VARIABLES END
    
  2. hdfs-site.xml

    vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
    
    <configuration>
     <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified in create time.
      </description>
     </property>
     <property>
       <name>dfs.namenode.name.dir</name>
       <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
     </property>
     <property>
       <name>dfs.datanode.data.dir</name>
       <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
     </property>
    </configuration>
    
  3. hadoop-env.sh

    vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
    
    export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
    export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
    
    for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
      if [ "$HADOOP_CLASSPATH" ]; then
        export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
      else
        export HADOOP_CLASSPATH=$f
      fi
    done
    
    export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
    export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
    export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
    
    export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
    
    export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
    export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"
    
    # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
    export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
    export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
    
    export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
    export HADOOP_PID_DIR=${HADOOP_PID_DIR}
    export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
    
    # A string representing this instance of hadoop. $USER by default.
    export HADOOP_IDENT_STRING=$USER
    
  4. core-site.xml

    vi /usr/local/hadoop/etc/hadoop/core-site.xml
    <configuration>
     <property>
      <name>hadoop.tmp.dir</name>
      <value>/app/hadoop/tmp</value>
      <description>A base for other temporary directories.</description>
     </property>
    
     <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:54310</value>
      <description>The name of the default file system.  A URI whose
      scheme and authority determine the FileSystem implementation.  The
      uri's scheme determines the config property (fs.SCHEME.impl) naming
      the FileSystem implementation class.  The uri's authority is used to
      determine the host, port, etc. for a filesystem.</description>
     </property>
    </configuration>
    
  5. mapred-site.xml

    vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
    <configuration>
     <property>
      <name>mapred.job.tracker</name>
      <value>localhost:54311</value>
      <description>The host and port that the MapReduce job tracker runs
      at.  If "local", then jobs are run in-process as a single map
      and reduce task.
      </description>
     </property>
    </configuration>
    

    $ javac -version

    javac 1.8.0_66
    

    $ java -version

    java version "1.8.0_66"  
    Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
    Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
    

I am new to Hadoop and could not find the issue. Where can I find the log files for the JobTracker and NameNode so I can track down what is happening to the services?

4
I found the issue. I made a silly mistake: the actual Hadoop user is hduse, not hduser. I changed the ownership of /usr/local/hadoop_store/hdfs to hduse. Wow, now it is working!! ..... Wanderer
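
In case anyone else hits the same thing, a rough sketch of that fix (the hadoop group name is an assumption taken from the tutorial; adjust it to your setup):

sudo chown -R hduse:hadoop /usr/local/hadoop_store/hdfs    # assumption: group is 'hadoop'
ls -ld /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode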

4 Answers

3
votes

If it is not an SSH issue, try the following:

  1. Delete all contents of the temporary directory with rm -rf /app/hadoop/tmp, then format the NameNode with bin/hadoop namenode -format. Start the NameNode and DataNode with sbin/start-dfs.sh, and run jps to check whether the daemons are running (see the sketch after this list).

  2. Check whether the Hadoop user has write permission on the hadoop_store/hdfs/namenode and datanode directories with ls -ld <directory>.

     You can change the permissions with sudo chmod 777 /hadoop_store/hdfs/namenode/ (or, better, chown the directories to the Hadoop user).
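
Roughly, the commands for the two steps above, as a sketch (the paths are taken from the question's core-site.xml and hdfs-site.xml; adjust if yours differ):

# Step 1: wipe the temp dir, reformat HDFS, restart and verify
rm -rf /app/hadoop/tmp
hdfs namenode -format        # wipes HDFS metadata; only do this on a fresh setup
start-dfs.sh
jps                          # NameNode, DataNode and SecondaryNameNode should be listed

# Step 2: check ownership and permissions of the HDFS directories
ls -ld /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode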

1
votes

If you take a closer look at the start-all.sh output, you can easily see the log file paths. Each service writes to its log as soon as it tries to start:

localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduse-datanode-wanderer-Lenovo-IdeaPad-S510p.out
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduse-secondarynamenode-wanderer-Lenovo-IdeaPad-S510p.out
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduse-resourcemanager-wanderer-Lenovo-IdeaPad-S510p.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduse-nodemanager-wanderer-Lenovo-IdeaPad-S510p.out
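
Each .out file listed above normally has a matching .log file in the same directory (/usr/local/hadoop/logs), and that is usually where the actual startup error and stack trace end up. For example (file name assumed from the .out name above):

tail -n 100 /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.log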
0
votes

You have to set up passwordless SSH authentication: the hduse user should be able to log in to localhost over SSH without being prompted for a password.
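
A minimal sketch of the usual setup, run as hduse (key type and paths are the common defaults):

# Generate a key pair with an empty passphrase and authorize it for localhost
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost exit    # should log in without asking for a password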

0
votes

The NameNode is not showing

After running the jps command, the NameNode is not listed even though the DataNode is. To solve the problem, follow the steps below.

This works for a configuration with Hadoop 2.7.6.

Step 1: Stop Hadoop

/usr/local/hadoop/sbin$ stop-dfs.sh

Step 2: Remove the tmp folder

/usr/local/hadoop/sbin$ sudo rm -rf /app/hadoop/tmp/

Step 3: Create a new tmp folder

/usr/local/hadoop/sbin$ sudo mkdir -p /app/hadoop/tmp

/usr/local/hadoop/sbin$ sudo chown hduser:hadoop /app/hadoop/tmp

/usr/local/hadoop/sbin$ chmod 750 /app/hadoop/tmp

Step 4: Format the NameNode

/usr/local/hadoop/sbin$ hdfs namenode -format

Step 5: Start the daemons

/usr/local/hadoop/sbin$ start-all.sh

/usr/local/hadoop/sbin$ jps

The NameNode should now be showing.
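
If everything came up, jps should list all five daemons, along these lines (the PIDs are placeholders and will differ):

2411 NameNode
2538 DataNode
2702 SecondaryNameNode
2871 ResourceManager
2995 NodeManager
3100 Jps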