151
votes

How do I copy a file from HDFS to the local file system? There is no physical location for the file on the file system, not even a directory. How can I move the files to my local machine for further validation? I tried through WinSCP.


9 Answers

261
votes
  1. bin/hadoop fs -get /hdfs/source/path /localfs/destination/path
  2. bin/hadoop fs -copyToLocal /hdfs/source/path /localfs/destination/path
  3. Point your web browser to the HDFS Web UI (namenode_machine:50070), browse to the file you intend to copy, scroll down the page and click Download.
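
A minimal sketch of options 1 and 2, reusing the placeholder paths from above; the -ls and ls checks are only there to confirm the source exists and the copy landed:

# confirm the source path exists in HDFS
bin/hadoop fs -ls /hdfs/source/path

# pull it down to the local file system and verify
bin/hadoop fs -get /hdfs/source/path /localfs/destination/path
ls -l /localfs/destination/path
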
33
votes

In Hadoop 2.0,

hdfs dfs -copyToLocal <hdfs_input_file_path> <output_path>

where,

  • hdfs_input_file_path may be obtained from http://<<name_node_ip>>:50070/explorer.html

  • output_path is the local path to which the file is to be copied.

  • You may also use get in place of copyToLocal.
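
The same namenode HTTP port also exposes the WebHDFS REST API, so the file can be fetched without the Hadoop CLI at all. This is only a sketch and assumes WebHDFS is enabled on the cluster; the host and paths are the same placeholders as above:

# the absolute HDFS path of the file goes right after /webhdfs/v1; -L follows the redirect to a datanode
curl -L "http://<<name_node_ip>>:50070/webhdfs/v1<hdfs_input_file_path>?op=OPEN" -o <output_path>
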

18
votes

In order to copy files from HDFS to the local file system, the following command can be run:

hadoop dfs -copyToLocal <input> <output>

  • <input>: the HDFS directory path (e.g. /mydata) that you want to copy
  • <output>: the destination directory path (e.g. ~/Documents)

Update: hadoop dfs is deprecated in Hadoop 3. Use instead:

hdfs dfs -copyToLocal <input> <output>
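
Note that <input> can also be a directory, in which case the whole tree is copied recursively; a small sketch using the example paths above:

# copies the entire /mydata directory tree under ~/Documents
hdfs dfs -copyToLocal /mydata ~/Documents
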

6
votes

You can accomplish this in either of these two ways:

1. hadoop fs -get <HDFS file path> <Local system directory path>
2. hadoop fs -copyToLocal <HDFS file path> <Local system directory path>

Ex:

My file is located at /sourcedata/mydata.txt and I want to copy it to the local file system at the path /user/ravi/mydata:

hadoop fs -get /sourcedata/mydata.txt /user/ravi/mydata/
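
The local target directory should already exist before the copy; a quick sketch of preparing it and then checking the result (mkdir and ls are plain shell commands, not HDFS ones):

# create the local destination, copy, then confirm the file arrived
mkdir -p /user/ravi/mydata
hadoop fs -get /sourcedata/mydata.txt /user/ravi/mydata/
ls -l /user/ravi/mydata
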
5
votes

If your source "file" is split across multiple files (perhaps as the result of MapReduce) that live in the same directory tree, you can copy it to a single local file with:

hadoop fs -getmerge /hdfs/source/dir_root/ local/destination
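
For instance, to collapse a typical MapReduce output directory full of part-* files into one local file (merged.txt and the -nl flag are just illustrative; -nl appends a newline after each merged file):

# merge part-00000, part-00001, ... into a single local file
hadoop fs -getmerge -nl /hdfs/source/dir_root/ local/destination/merged.txt
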
3
votes

This worked for me on my VM instance of Ubuntu.

hdfs dfs -copyToLocal [hadoop directory] [local directory]

1
votes

1. Remember the name you gave the file when you ran hdfs dfs -put. To bring it back, use get instead, as shown below:

$ hdfs dfs -get /output-fileFolderName-In-hdfs
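
A tiny round-trip sketch (the file and HDFS names here are made up): whatever name was used with -put is the name handed to -get later:

# upload under a name of your choosing ...
hdfs dfs -put results.txt /results-in-hdfs
# ... and pull it back into the current directory under that same name
hdfs dfs -get /results-in-hdfs .
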

0
votes

If you are using Docker, you have to do the following steps:

  1. Copy the file from HDFS to the namenode container (hadoop fs -get output/part-r-00000 /out_text). "/out_text" will be stored on the namenode.

  2. Copy the file from the namenode container to the local disk with docker cp namenode:/out_text output.txt.

  3. output.txt will then be in your current working directory.
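
Put together as one hedged sketch, run from the host; it assumes the container is named namenode (as in the steps above) and that the hadoop binary is on the container's PATH:

# step 1: run the HDFS copy inside the namenode container
docker exec namenode hadoop fs -get output/part-r-00000 /out_text
# step 2: copy the result out of the container onto the host
docker cp namenode:/out_text ./output.txt
# step 3: the file is now in the current working directory
cat output.txt
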

-3
votes
bin/hadoop fs -put /localfs/destination/path /hdfs/source/path