I have a few hundred files, about 400 GB of data in CSV format, with the following specification:
- enclosure: double quote
- separator: comma
- escape character: backslash
A data line can look like this:
a,30,"product, A","my product : \"good product\""
I think BigQuery evaluates the data as:

col 1: `a`, col 2: `30`, col 3: `product`, col 4: ` A"`, col 5: `my product : "good product"`

and I want:

col 1: `a`, col 2: `30`, col 3: `product, A`, col 4: `my product : "good product"`
Is it possible to load this kind of file without using Dataflow or Dataprep? My load command is:
bq load --noreplace --source_format=CSV --max_bad_records=1000000 --allow_jagged_rows ods.my_file gs://file/file.csv.gz
My data ended up shifted, and BigQuery rejected rows with errors like:
Error while reading data, error message: Could not parse 'XXX' as int for field (position 49) starting at location 2121
The problem is stray data between a closing double quote (`"`) and the field separator. Stripping the extra quote before loading fixes it, for example:
sed -i '' 's/\"\"STRING_WITH_EXTRA_QUOTE_HERE\"/\"STRING_WITH_EXTRA_QUOTE_HERE\"/g' YOUR_FILE.csv
- Jas
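If the stray quotes all come from backslash escaping, as in the sample line in the question, a more general rewrite may work. This is only a sketch: it assumes GNU `sed`, and the file names `sample.csv` / `fixed.csv` are invented for illustration. BigQuery's CSV reader follows RFC 4180, where a quote inside a quoted field is escaped by doubling it (`""`), not with a backslash, so converting `\"` to `""` before loading should make the fields parse as intended:

```shell
# Reproduce the sample line from the question (printf's \\ emits one backslash).
printf 'a,30,"product, A","my product : \\"good product\\""\n' > sample.csv

# Rewrite every backslash-escaped quote as an RFC 4180 doubled quote.
sed 's/\\"/""/g' sample.csv > fixed.csv

cat fixed.csv
# a,30,"product, A","my product : ""good product"""
```

The rewritten file should then load with the same `bq load` command, since BigQuery treats `""` inside a quoted field as a literal quote character.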