For example, I have a list of URLs stored as strings in Datastore. I used the DatastoreIO transform to read them into a PCollection. Then, in a ParDo's DoFn, for each URL (which is the Google Cloud Storage location of a file), I need to read the file at that location and apply further transformations.
So I want to know whether I can apply a ParDo to a PCollection from inside another ParDo's DoFn, so that each file's transformation runs in parallel, and emit something like a KV(key, PCollection) as the output of the first ParDo.
Sorry if I haven't presented my scenario clearly; I'm a newbie to Apache Beam and Google Dataflow.
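To make it more concrete, here's a rough sketch of what I have in mind (Java SDK). The project id, the "FileUrl" kind, and the "url" property name are placeholders for my actual Datastore setup:

```java
import com.google.datastore.v1.Entity;
import com.google.datastore.v1.KindExpression;
import com.google.datastore.v1.Query;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.datastore.DatastoreIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.DoFn.ProcessElement;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class UrlFilePipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Query for the entities that hold the file URLs ("FileUrl" kind is a placeholder).
    Query query = Query.newBuilder()
        .addKind(KindExpression.newBuilder().setName("FileUrl").build())
        .build();

    // Read the URL entities from Datastore.
    PCollection<Entity> entities = p.apply(
        DatastoreIO.v1().read()
            .withProjectId("my-project")  // placeholder project id
            .withQuery(query));

    // Extract the GCS location string from each entity.
    PCollection<String> urls = entities.apply("ExtractUrls",
        ParDo.of(new DoFn<Entity, String>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            // "url" is the (placeholder) property holding the GCS path.
            c.output(c.element().getPropertiesMap().get("url").getStringValue());
          }
        }));

    // This is where my question comes in: each element here is one GCS path,
    // and I'd like to read that file and run further transformations on its
    // contents, ideally in parallel, as if applying another ParDo inside.
    urls.apply("TransformFiles",
        ParDo.of(new DoFn<String, String>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            String gcsPath = c.element();  // e.g. "gs://my-bucket/some/file"
            // ... read the file at gcsPath and transform it ...
            // Can I apply a ParDo to a PCollection at this point?
          }
        }));

    p.run();
  }
}
```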