After loading a Parquet file from Google Cloud Storage into a BigQuery table, the data shown in the Preview tab (inside BigQuery) differs from the original source data, even though the schema is correct.
1 Answer
I would think that if the schema is correct, the loaded data must be correct as well. My best guess is that the data in the Parquet file is masked and you would need a function to unmask it.
To verify whether the Parquet file contains the same data that was loaded into BigQuery, you can list a few rows of the original Parquet file with parquet-tools:
$ hadoop jar parquet-tools-1.9.0.jar head file:///ea4b68c5d20bbc90-bfec9bfd00000000_333529865_data.0.parq
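If parquet-tools is not at hand, the same check can be sketched in Python. This is only a rough sketch under some assumptions: the file has been copied to a local path, the third-party pyarrow package is installed, and the helper names `read_parquet_head` and `find_missing_rows` are made up for illustration. The comparison ignores row order, so it does not depend on which rows the Preview tab happens to show first.

```python
from typing import Any, Dict, List


def read_parquet_head(path: str, n: int = 5) -> List[Dict[str, Any]]:
    """Return the first n rows of a local Parquet file as dictionaries."""
    # pyarrow is a third-party package; it is imported here so the
    # comparison helper below can be used without it.
    import pyarrow.parquet as pq

    table = pq.read_table(path)
    return table.slice(0, n).to_pylist()


def find_missing_rows(
    source_rows: List[Dict[str, Any]],
    loaded_rows: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return the source rows that appear nowhere in loaded_rows.

    Row order and column order are ignored. Values are compared by
    their repr(), so the same value held in different Python types
    (for example Decimal vs. float) is reported as a mismatch.
    """

    def row_key(row: Dict[str, Any]) -> tuple:
        return tuple(sorted((name, repr(value)) for name, value in row.items()))

    loaded_keys = {row_key(row) for row in loaded_rows}
    return [row for row in source_rows if row_key(row) not in loaded_keys]
```

Here `source_rows` would come from `read_parquet_head(...)` and `loaded_rows` from the rows returned by a query against the BigQuery table, converted to dictionaries. An empty result means every sampled source row was found in the loaded data.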
... select * from the table and not only use the preview tab to check the values – Tamir Klein
... master-tangent-240211.Demo_2019.Demo_parquet LIMIT 1000). Please help me. Thanks – Nurma Sbl