I'm using the command below to launch a cluster:
./elastic-mapreduce --create \
--stream \
--cache s3n://bucket_name/code/totalInstallUsers#totalInstallUsers \
--input s3n://bucket_name/input \
--output s3n://bucket_name/output \
--mapper s3n://bucket_name/code/mapper.py \
--reducer s3n://bucket_name \
--jobflow-role EMR_EC2_DefaultRole \
--service-role EMR_DefaultRole \
--debug \
--log-uri s3n://bucket_name/logs
I always get the error message below. If I remove the --cache option, the cluster launches successfully.
Error: undefined method `each' for #<String:0x00000002c28ba0>
/home/ubuntu/data_processing/commands.rb:806:in `steps'
/home/ubuntu/data_processing/commands.rb:1232:in `block in enact'
/home/ubuntu/data_processing/commands.rb:1232:in `map'
/home/ubuntu/data_processing/commands.rb:1232:in `enact'
/home/ubuntu/data_processing/commands.rb:49:in `block in enact'
/home/ubuntu/data_processing/commands.rb:49:in `each'
/home/ubuntu/data_processing/commands.rb:49:in `enact'
/home/ubuntu/data_processing/commands.rb:2422:in `create_and_execute_commands'
/home/ubuntu/data_processing/elastic-mapreduce-cli.rb:13:in `<top (required)>'
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
./elastic-mapreduce:6:in `<main>'
The reason I use --cache is that I want mapper.py to be able to open the data file with "with open('./totalInstallUsers', 'r') as infile:".
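To make the intent concrete, here is a minimal sketch of the kind of mapper this would enable. It is not the actual mapper.py. The file format (one user ID per line), the tab-separated input, and the helper names load_users and map_lines are all assumptions made for illustration.

```python
# Hypothetical sketch: a streaming mapper that reads a side file shipped via --cache.
# Assumes the cached file appears as ./totalInstallUsers in the task's working directory.
CACHE_FILE = './totalInstallUsers'


def load_users(path=CACHE_FILE):
    """Read the cached file into a set (assumed format: one user ID per line)."""
    with open(path, 'r') as infile:
        return set(line.strip() for line in infile if line.strip())


def map_lines(lines, users):
    """Emit 'user<TAB>1' for each input record whose first field is a known user.

    Assumes tab-separated input with the user ID in the first column.
    """
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        if fields[0] in users:
            yield '%s\t1' % fields[0]
```

In a real mapper the script would end with something like "for out in map_lines(sys.stdin, load_users()): print(out)", so that records arrive on stdin and results go to stdout as Hadoop streaming expects.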
Could anyone give me a clue? Thanks.