When creating a data pipeline via API / CLI that creates an EmrCluster, I can specify multiple steps using an array structure:
{ "objects" : [
    { "id" : "myEmrCluster",
      "terminateAfter" : "1 hours",
      "schedule" : {"ref":"theSchedule"},
      "step" : ["some.jar,-param1,val1", "someOther.jar,-foo,bar"] },
    { "id" : "theSchedule", "period":"1 days" }
] }
I can call put-pipeline-definition, referencing the file above, to create a number of steps for the EMR cluster.
Now, if I want to create the pipeline using CloudFormation, I can use the PipelineObjects property of an AWS::DataPipeline::Pipeline resource to configure the pipeline. However, each field of a pipeline object can only be a StringValue or a RefValue. How can I create an array-valued pipeline object field?
Here's the corresponding CloudFormation template:
"Resources" : {
  "MyEMRCluster" : {
    "Type" : "AWS::DataPipeline::Pipeline",
    "Properties" : {
      "Name" : "MyETLJobs",
      "Activate" : "true",
      "PipelineObjects" : [
        {
          "Id" : "myEmrCluster",
          "Fields" : [
            { "Key" : "terminateAfter", "StringValue" : "1 hours" },
            { "Key" : "schedule", "RefValue" : "theSchedule" },
            { "Key" : "step", "StringValue" : "some.jar,-param1,val1" }
          ]
        },
        {
          "Id" : "theSchedule",
          "Fields" : [
            { "Key" : "period", "StringValue" : "1 days" }
          ]
        }
      ]
    }
  }
}
With the above template, step is a single StringValue, equivalent to:
"step" : "some.jar,-param1,val1"
and not an array like the desired config.
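To make the mismatch concrete, here is a rough Python sketch of how an object from the definition file could be flattened into the Key / StringValue / RefValue shape. The helper name to_fields is hypothetical, and emitting one field per array element with a repeated "step" key is only an assumption on my part: I have not confirmed that CloudFormation accepts repeated keys.

```python
import json


def to_fields(obj):
    """Flatten one definition-file object into CloudFormation-style fields.

    Hypothetical helper for illustration only. Assumption: an array value
    becomes several fields that share the same Key.
    """
    fields = []
    for key, value in obj.items():
        if key == "id":
            continue  # the id maps to the object's "Id", not to a field
        # Treat a scalar as a one-element list so both cases share one path.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict) and "ref" in item:
                fields.append({"Key": key, "RefValue": item["ref"]})
            else:
                fields.append({"Key": key, "StringValue": str(item)})
    return fields


# The EmrCluster object from the definition file above.
cluster = {
    "id": "myEmrCluster",
    "terminateAfter": "1 hours",
    "schedule": {"ref": "theSchedule"},
    "step": ["some.jar,-param1,val1", "someOther.jar,-foo,bar"],
}

pipeline_object = {"Id": cluster["id"], "Fields": to_fields(cluster)}
print(json.dumps(pipeline_object, indent=2))
```

Under that assumption the two steps come out as two separate "step" fields rather than one array, which is the only representation I can see that fits the StringValue / RefValue restriction.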
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-datapipeline-pipeline-pipelineobjects-fields.html shows that only StringValue and RefValue are valid keys. Is it possible to create an array of steps via CloudFormation?
Thanks in advance.