We have a Cloudera 5 installation running on a single node on a single server. Before adding 2 more nodes to the cluster, we want to increase the size of the data partition by moving it to a new disk.
We have the following services installed:
- YARN with 1 NodeManager, 1 JobHistory server, and 1 ResourceManager
- HDFS with 1 DataNode, 1 primary NameNode, and 1 Secondary NameNode
- HBase with 1 Master and 1 RegionServer
- ZooKeeper with 1 server
All data currently lives on a single partition mounted at /dfs. The amount of data to be collected has grown, so we need another disk to store everything.
The working partition is:
df -h
hadoop-dfs-partition 119G 9.8G 103G 9% /dfs
df -i
hadoop-dfs-partition 7872512 18098 7854414 1% /dfs
The content of /dfs is the following:
drwxr-xr-x 11 root root 4096 May 8 2014 dfs
drwx------. 2 root root 16384 May 7 2014 lost+found
drwxr-xr-x 5 root root 4096 May 8 2014 yarn
Under dfs there are these folders:
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn2
Under yarn there are these folders:
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm1
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm2
How can we achieve this? I only found ways to migrate data between clusters with the distcp command.
I didn't find any way to move the raw data on disk.
Is stopping all services, shutting down the entire cluster, and then running
cp -Rp /dfs/* /dfs-new/
a viable option?
(/dfs-new is the folder where the new ext4 partition on the new disk is mounted.)
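To make the plan concrete, here is a rough sketch of the offline copy I have in mind, not a tested procedure. It assumes every service is already stopped and that it runs as root so ownership is preserved. copy_tree is just a name I made up for illustration. It uses cp -a on /dfs/. rather than /dfs/* so that hidden files are included, then diff -r to check that the two trees match.

```shell
#!/bin/sh
# Sketch only: copy a directory tree to a new mount point and verify it.
# Assumption: all Hadoop services are already stopped before this runs.
# copy_tree is a hypothetical helper name, not part of any Hadoop tooling.

copy_tree() {
    src=$1
    dst=$2
    # -a preserves owner, group, mode, timestamps and symlinks
    # (ownership is only preserved when run as root).
    # "$src/." also picks up hidden files, which "$src"/* would skip.
    cp -a "$src/." "$dst/" || return 1
    # Compare both trees file by file; non-zero exit on any difference.
    diff -r "$src" "$dst" > /dev/null
}

# Usage with the paths from this post (left commented out on purpose):
# copy_tree /dfs /dfs-new && echo "copy verified"
```

The function returns non-zero if either the copy or the comparison fails, so the old partition would stay untouched until the new one has been checked.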
Is there a better way of doing this?
Thank you in advance