I need to read multiple Parquet files in Apache Beam. All of the files are in the same folder, and I've tried to read them using the wildcard character *.
I've managed to read a single Parquet file using ParquetIO. This is the snippet I use to read one Parquet file:
pipeline.apply(ParquetIO.read(SCHEMA).from(filePath + File.separator + "*"));
where filePath is, for example, /path/xxx.parquet.
This is the snippet I've tried for reading multiple Parquet files:
pipeline.apply(ParquetIO.read(SCHEMA).from(folderPath + File.separator + "*.parquet" + File.separator + "*"));
where folderPath is, for example, /path/to/parquet/files/
I also tried it without the last part, File.separator + "*", but got the same result. The log message I get is:
FileIO:654 - Matched 0 files for pattern /path/to/parquet/files/*.parquet/*
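To narrow this down, the pattern can be checked outside Beam. The following is only a minimal sketch: it assumes a Unix-like system and that, for local paths, the pattern is resolved with java.nio-style glob rules, which I have not confirmed for Beam itself. The class GlobCheck, the method matches, and the file names a.parquet and part-0000 are hypothetical names used for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper: checks a glob pattern against a path string using
// java.nio glob rules. It does not touch the file system and does not use Beam.
class GlobCheck {

    // Returns true if `path` matches `glob` under java.nio "glob:" syntax,
    // where `*` matches any characters within a single path segment only.
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String folder = "/path/to/parquet/files/";

        // Pattern with only "*.parquet": matches a file directly in the folder.
        System.out.println(matches(folder + "*.parquet", folder + "a.parquet"));

        // Pattern from my failing attempt: requires something *inside* an
        // entry ending in .parquet, so a plain file does not match.
        System.out.println(matches(folder + "*.parquet/*", folder + "a.parquet"));

        // The same pattern does match when a.parquet is a directory
        // containing part files.
        System.out.println(matches(folder + "*.parquet/*", folder + "a.parquet/part-0000"));
    }
}
```

Under these assumptions, the pattern ending in *.parquet/* can only match entries one level below something named *.parquet, so whether it matches anything depends on whether the .parquet entries in the folder are plain files or directories of part files.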
Also, the number and names of the Parquet files can vary.
Is it possible to read multiple Parquet files using Apache Beam? I have already found a way to read multiple text files.