We have a Cloudera 5 installation based on a single node on a single server. Before adding 2 additional nodes to the cluster, we want to increase the size of the partition using a fresh new disk.

We have the following services installed:

  • YARN with 1 NodeManager, 1 JobHistory Server and 1 ResourceManager
  • HDFS with 1 DataNode, 1 NameNode and 1 Secondary NameNode
  • HBase with 1 Master and 1 RegionServer
  • ZooKeeper with 1 server

All data is currently stored on a single partition. The amount of data being collected has increased, so we need to use another disk to store all the information.

All the data is under a partition mounted on the folder /dfs.

The working partition is:

df -h

Filesystem            Size  Used Avail Use% Mounted on
hadoop-dfs-partition  119G  9.8G  103G   9% /dfs

df -i

Filesystem            Inodes  IUsed   IFree IUse% Mounted on
hadoop-dfs-partition 7872512  18098 7854414    1% /dfs

The content of this folder is the following:

drwxr-xr-x 11 root root 4096 May 8 2014 dfs
drwx------. 2 root root 16384 May 7 2014 lost+found
drwxr-xr-x 5 root root 4096 May 8 2014 yarn

Under dfs there are these folders:

drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn2

Under yarn there are these folders:

drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm1
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm2

How can we achieve this? I only found ways to migrate data between clusters with the distcp command.

I didn't find any way to move the raw data.
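
For reference, the distcp usage I found looks like this (the hostnames and paths are placeholders), but it copies between two live HDFS filesystems rather than moving local block data:

hadoop distcp hdfs://source-namenode:8020/path hdfs://destination-namenode:8020/path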

Is stopping all services and shutting down the entire cluster before performing a

cp -Rp /dfs/* /dfs-new/

command a viable option?

(/dfs-new is the folder where the fresh new ext4 partition of the new disk is mounted.)

Any better way of doing this?

Thank you in advance

1 Answer

I resolved it this way:

  1. Stop all services except HDFS.

  2. Export the data out of HDFS. In my case the interesting part was in HBase:

    su - hdfs  
    hdfs dfs -ls / 
    

    The command shows me the following data:
    drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase
    drwxr-xr-x - hdfs supergroup 0 2015-02-26 19:58 /tmp
    drwxr-xr-x - hdfs supergroup 0 2015-02-26 19:38 /user

    hdfs dfs -copyToLocal / /a_backup_folder/  
    

    to export all the data from HDFS to a normal file system.

    control-D  
    

    to return to the root user.

    Then stop ALL services in Cloudera (HDFS included).
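
    Before the shutdown, a sanity check I'd suggest (not part of the original procedure) is to compare the size of the local backup with what HDFS reports:

    hdfs dfs -du -s -h /       # total size of the data in HDFS
    du -sh /a_backup_folder    # size of the local copy, should be comparable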

  3. Now you can unmount the "old" and the "new" partitions.

  4. mount the "new" partition in place of the path of the "old" one (in my case is /dfs)

  5. mount the "old" partition in a new place in my case is /dfs-old (remember to mkdir /dfs-old) in this way can check the old structure

  6. Make this change permanent by editing /etc/fstab. Check that everything is correct by repeating step 3, and afterwards try a

    mount -a 
    
  7. Run df -h to check that you have /dfs and /dfs-old mapped to the proper partitions (the "new" and the "old" one respectively). A sketch of steps 3 to 7 follows.
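
    Put together, steps 3 to 7 might look like this, assuming the new partition is /dev/sdb1, the old one is /dev/sda3, and the new partition was temporarily mounted on /dfs-new (all hypothetical names, adapt them to your system):

    umount /dfs-new               # free the "new" partition from its temporary path
    umount /dfs                   # unmount the "old" partition
    mount /dev/sdb1 /dfs          # the "new" partition takes over /dfs
    mkdir /dfs-old
    mount /dev/sda3 /dfs-old      # keep the "old" partition reachable for checks
    # matching /etc/fstab entries (adjust devices and filesystem types):
    #   /dev/sdb1  /dfs      ext4  defaults  0 0
    #   /dev/sda3  /dfs-old  ext4  defaults  0 0
    mount -a                      # complains if the fstab entries are wrong
    df -h                         # /dfs and /dfs-old should map to the new and old partitions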

  8. Format the NameNode by going into

    services > hdfs > namenode > action format namenode
    In my case, after doing

    ls -l /dfs/dfs  
    

    I have:
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn1
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn2
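
    (If you are not going through Cloudera Manager, the rough command-line equivalent of this action, as an assumption on my part, would be the following; it wipes the NameNode metadata directories, so run it only on purpose:)

    sudo -u hdfs hdfs namenode -format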

  9. Start the HDFS service in Cloudera.

    You should have new folders:

    ls -l /dfs/dfs  
    

    I have:

    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn
    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn1
    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 dn2
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn1
    drwx------ 4 hdfs hadoop 4096 Feb 26 20:39 nn2
    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn
    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn1
    drwx------ 3 hdfs hadoop 4096 Feb 26 20:39 snn2

  10. Now copy the data back into the new partition:

    hdfs dfs -copyFromLocal /a_backup_folder/user/* /user  
    hdfs dfs -copyFromLocal /a_backup_folder/tmp/* /tmp  
    hdfs dfs -copyFromLocal /a_backup_folder/hbase/* /hbase  
    
  11. The hbase folder needs to have the proper permissions, with hbase:hbase as user:group:

    hdfs dfs -chown -R hbase:hbase /hbase  
    

    If you forget this step, you will get "permission denied" errors in the HBase log files later.

    Check the result with:

    hdfs dfs -ls /hbase
    

    You should see something like this:
    drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase/.tmp
    drwxr-xr-x - hbase hbase 0 2015-02-26 20:40 /hbase/WALs
    drwxr-xr-x - hbase hbase 0 2015-02-27 11:38 /hbase/archive
    drwxr-xr-x - hbase hbase 0 2015-02-25 15:18 /hbase/corrupt
    drwxr-xr-x - hbase hbase 0 2015-02-25 15:18 /hbase/data
    -rw-r--r-- 3 hbase hbase 42 2015-02-25 15:18 /hbase/hbase.id
    -rw-r--r-- 3 hbase hbase 7 2015-02-25 15:18 /hbase/hbase.version
    drwxr-xr-x - hbase hbase 0 2015-02-27 11:42 /hbase/oldWALs

(The important part here is to have the proper user and group on the files and folders.)

Now start all services and check whether HBase is working with:

    hbase shell  
    list

You should see all the tables you had before the migration. Try with:

    count 'a_table_name'