We've created a pipeline that transforms 3 streams located in GCS ('Clicks', 'Impressions', 'ActiveViews'). We need to write the individual streams back out to GCS, each to separate files (to be loaded into BigQuery later), because they all have a slightly different schema.
One of the writes has failed twice in succession, with a different error each time, which in turn causes the pipeline to fail.
These are the last 2 workflow/pipeline runs, as represented visually in the GDC, both of which show the failure:
The 1st error:
Feb 21, 2015, 12:55:14 PM (b0cbc05dfc56dbd9): Workflow failed. Causes: (f98c177c56055863): Map task completion for Step "ActiveViews-GSC-write" failed. Causes: (2d838e694976dc6): Expansion failed for filepattern: gs://cdf/binaries/tmp-38156614004ed90e-[0-9][0-9][0-9][0-9][0-9]-of-[0-9][0-9][0-9][0-9][0-9].avro.
The 2nd error:
Feb 21, 2015, 1:20:15 PM (19dcdcf1fe125eeb): Workflow failed. Causes: (2a27345ef73673d3): Map task completion for Step "ActiveViews-GSC-write" failed. Causes: (8f79a20dfa5c4d2b): Unable to view metadata for file: gs://cdf/binaries/tmp-2a27345ef7367fe6-00001-of-00015.avro.
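To make the two messages easier to compare, here is a minimal Python sketch (not Dataflow code) of what a shard file pattern of this shape would match, assuming ordinary shell-style glob semantics. The helper name `shard_pattern` is hypothetical, and this says nothing about how the service itself performs the expansion; it only shows that each run uses its own temp-file prefix, so the two errors refer to different sets of temp files.

```python
from fnmatch import fnmatchcase


def shard_pattern(prefix: str) -> str:
    """Hypothetical helper: build a glob shaped like the one in the first error."""
    digits = "[0-9]" * 5  # five single-digit character classes
    return f"{prefix}-{digits}-of-{digits}.avro"


# Temp-file prefixes copied from the two error messages above.
first_run = "gs://cdf/binaries/tmp-38156614004ed90e"
second_run = "gs://cdf/binaries/tmp-2a27345ef7367fe6"

# Shard name copied from the second error message.
shard = "gs://cdf/binaries/tmp-2a27345ef7367fe6-00001-of-00015.avro"

# The shard matches the pattern built from its own run's prefix...
matches_own_run = fnmatchcase(shard, shard_pattern(second_run))
# ...but not the pattern built from the first run's prefix.
matches_other_run = fnmatchcase(shard, shard_pattern(first_run))

print(matches_own_run, matches_other_run)
```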
It's only happening on the "ActiveViews-GSC-write" step.
Any idea what we're doing wrong?