For various reasons, I have one Hadoop installation on machine A, a second on cluster B, and a third on cluster C.

When I set up machine A, I configured the XML files so that the HDFS shell on machine A uses machine A's HDFS.

I can rewrite the XML files on machine A so that the HDFS shell invoked from machine A sees a different HDFS by default.

However, I would like to access all three filesystems conveniently, without editing the XML files each time.

Example: while logged in at machine A, I would like to copy a file from cluster B to cluster C with syntax something like:

hdfs dfs -cp hdfs://nn1.exampleB.com/file1 hdfs://nn2.exampleC.com/file2

Currently that syntax does not seem to work, and the errors vary: sometimes they are EOF errors, other times network timeouts.

Should the above syntax be valid without modifications to the XML configuration files?

1 Answer

You should use the distcp command:

$ hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo

See more here: http://hadoop.apache.org/docs/r0.19.0/distcp.html
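
Applied to the hostnames in the question, this might look roughly like the sketch below. It is only a sketch under assumptions: `hdfs_uri` is a hypothetical helper (not part of Hadoop), and port 8020 is assumed to be the NameNode RPC port on both clusters, which may not match your setup. The commands are printed rather than executed, so you can check them first; listing the source with `hdfs dfs -ls` is a quick way to see whether machine A can reach cluster B at all before starting a copy.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): build a fully qualified HDFS URI.
#   $1 = NameNode host, $2 = absolute path, $3 = RPC port (assumed 8020 if omitted)
hdfs_uri() {
  printf 'hdfs://%s:%s%s\n' "$1" "${3:-8020}" "$2"
}

# Hostnames and paths taken from the question; the port is an assumption.
SRC=$(hdfs_uri nn1.exampleB.com /file1)
DST=$(hdfs_uri nn2.exampleC.com /file2)

# Print the commands instead of running them; remove 'echo' to execute.
echo hdfs dfs -ls "$SRC"          # connectivity check against cluster B
echo hadoop distcp "$SRC" "$DST"  # copy from cluster B to cluster C
```

If the `-ls` check already fails with a timeout, the problem is more likely network reachability or a wrong port than the URI syntax itself.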