I'm trying to use the Google Dataflow Java SDK, but for my use case the input files are .parquet files.
I couldn't find any out-of-the-box functionality to read Parquet into a Dataflow pipeline as a bounded data source. As I understand it, I could implement a custom source and/or coder, a bit like AvroIO, on top of a Parquet reader.
Can anyone advise on the best way to implement this, or point me to a how-to reference or examples?
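For context, this is roughly the kind of thing I have in mind — an untested sketch that assumes the parquet-avro `AvroParquetReader` and the Dataflow SDK 1.x `DoFn` API (the class name `ReadParquetFn` is just mine), reading whole files in a `DoFn` rather than a proper splittable bounded source:

```java
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;

// Hypothetical sketch: takes a file path as input, emits the file's
// records as Avro GenericRecords. Not splittable, so each file is
// read by a single worker.
class ReadParquetFn extends DoFn<String, GenericRecord> {
  @Override
  public void processElement(ProcessContext c) throws Exception {
    try (ParquetReader<GenericRecord> reader =
        AvroParquetReader.<GenericRecord>builder(new Path(c.element())).build()) {
      GenericRecord record;
      // read() returns null once the file is exhausted
      while ((record = reader.read()) != null) {
        c.output(record);
      }
    }
  }
}
```

I realize this loses the parallel-read behavior that a real bounded source (like AvroIO's) would give, which is part of why I'm asking what the recommended approach is.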
Appreciate your help!
--A