
I have a use case where I need to copy the latest generated HDFS file to a remote Linux server. I do not want to store it as an intermediate file on the local file system and then scp it to the remote server.

I am aware of the following approach, but I want to AVOID it (for the obvious reason: the overhead of storing a huge file on the local fs):

hadoop fs -copyToLocal <src> <dest>
and then scp to the remote Linux server

Is there a command to copy an HDFS file directly to a remote Linux server?

Why not run the Hadoop command from that server? ssh user@host 'hadoop fs -copyToLocal ...' - OneCricketeer
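A minimal sketch of that pull-based suggestion, assuming the remote server has a Hadoop client installed and network access to the cluster (the user, host, and paths below are hypothetical placeholders):

# connect to the remote server and run the copy there; assumes a configured
# Hadoop client on that machine; user@remotehost and both paths are placeholders
ssh user@remotehost 'hadoop fs -copyToLocal /user/me/output/part-00000 /data/incoming/'

Note this still writes the file to that server's local disk, which is the point of the suggestion, but it does avoid a second scp hop.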

1 Answer


You can stream the data using Linux pipes if SSH access to the server is available:

hdfs dfs -cat my-file.txt | ssh myserver.com 'cat > /path/to/my-file.txt'

The first command reads the file data as a stream, and the second redirects it to the remote server. This worked for me. Take into account that ssh can time out if there's nothing on the wire for a long time.
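As a sketch of how this might be combined with the "latest generated file" part of the question, plus an SSH keep-alive option to reduce the timeout risk (the HDFS directory, destination path, and hostname are hypothetical, and the sort/awk selection assumes the usual eight-column hdfs dfs -ls output):

# pick the most recently modified file under a hypothetical HDFS directory
# (columns 6-7 of hdfs dfs -ls are the modification date and time, column 8 the path)
latest=$(hdfs dfs -ls /user/me/output | sort -k6,7 | tail -n 1 | awk '{print $8}')

# stream it straight to the remote server; ServerAliveInterval makes the ssh
# client send periodic keep-alive packets so the connection is less likely to drop
hdfs dfs -cat "$latest" | ssh -o ServerAliveInterval=60 myserver.com "cat > /path/to/$(basename "$latest")"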

Credit to this answer: https://superuser.com/questions/291829