
Hello, I am trying to copy a file from my S3 bucket into HDFS using the cp command. I do something like: hadoop --config config fs -cp s3a://path hadooppath. This works well when my config is on my local machine. However, now I am trying to set it up as an Oozie job, and I am unable to pass the configuration files present in the config directory on my local system. Even if they are in HDFS, it still doesn't seem to work. Any suggestions?

I tried the -D option in Hadoop and passed name/value pairs, but it still throws an error. It works only from my local system.
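For reference, a -D-based invocation for s3a credentials typically looks something like the sketch below. The property names fs.s3a.access.key and fs.s3a.secret.key are the standard s3a connector settings; the keys, bucket, and destination path here are placeholders:

# generic options (-D) must come before the FsShell command (-cp)
hadoop fs \
  -D fs.s3a.access.key=YOUR_ACCESS_KEY \
  -D fs.s3a.secret.key=YOUR_SECRET_KEY \
  -cp s3a://your-bucket/path/file.txt /user/hadoop/destination/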

Welcome to Stack Overflow! Can you please elaborate your question with your effort, like code or something, so that people can understand your problem early and help you? Thanks! – Enamul Hassan

1 Answer


Did you try DistCp in Oozie? Hadoop 2.7.2 supports S3 as a data source. You can schedule it with coordinators; just pass the credentials to the coordinator either through the REST API or in a properties file. It is an easy way to copy data periodically (in a scheduled manner).

${HADOOP_HOME}/bin/hadoop distcp s3://<source>/ hdfs://<destination>/
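A minimal sketch of what the corresponding Oozie workflow might look like, assuming the distcp action extension is enabled on your Oozie server; the bucket, paths, and the ${s3AccessKey}/${s3SecretKey} parameters (which you would supply from the coordinator or a job.properties file) are placeholders:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="s3-to-hdfs-copy">
    <start to="copy-from-s3"/>
    <action name="copy-from-s3">
        <distcp xmlns="uri:oozie:distcp-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- s3a credentials injected from coordinator / job.properties -->
                <property>
                    <name>fs.s3a.access.key</name>
                    <value>${s3AccessKey}</value>
                </property>
                <property>
                    <name>fs.s3a.secret.key</name>
                    <value>${s3SecretKey}</value>
                </property>
            </configuration>
            <!-- source and destination passed as DistCp arguments -->
            <arg>s3a://your-bucket/source/</arg>
            <arg>${nameNode}/user/hadoop/destination/</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>DistCp failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>

This way the credentials live in the coordinator definition or a properties file rather than in your local config directory, which is what allows the copy to run on a schedule without your local machine.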