I want to load data from hundreds of CSV files on Google Cloud Storage and append them to a single table in BigQuery on a daily basis using Cloud Dataflow (preferably using the Python SDK). Can you please let me know how I can accomplish that?
Thanks
This can be done with the Python SDK as well. The snippet below parses each CSV row into a dictionary keyed by column name, which is the format WriteToBigQuery expects.
def format_output_json(element):
    """
    :param element: a single row of the CSV, as a comma-separated string.
    :return: a dictionary mapping each column name to the corresponding
        value in the row.
    Note: the column names are hard-coded here, but they could also be
    resolved at run time (e.g. from the CSV header).
    """
    row_indices = ['time_stamp', 'product_name', 'units_sold', 'retail_price']
    row_data = element.split(',')
    # Pair each column name with its value in the row.
    return dict(zip(row_indices, row_data))
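To wire this into a full load job, a minimal pipeline sketch might look like the following. The bucket path, project, region, dataset, table name, and schema are all placeholder assumptions here; substitute your own values.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run():
    # Placeholder pipeline options -- replace project, region, and bucket.
    options = PipelineOptions(
        runner='DataflowRunner',
        project='my-project',
        region='us-central1',
        temp_location='gs://my-bucket/temp',
    )
    with beam.Pipeline(options=options) as p:
        (p
         # A glob pattern picks up all CSV files in the bucket;
         # skip_header_lines drops each file's header row.
         | 'ReadCSVs' >> beam.io.ReadFromText(
               'gs://my-bucket/input/*.csv', skip_header_lines=1)
         # Convert each comma-separated line into a dict of column -> value.
         | 'ParseRows' >> beam.Map(format_output_json)
         # Append the rows to a single BigQuery table; the schema is an
         # assumption based on the columns above.
         | 'WriteToBQ' >> beam.io.WriteToBigQuery(
               'my-project:my_dataset.sales',
               schema=('time_stamp:TIMESTAMP,product_name:STRING,'
                       'units_sold:INTEGER,retail_price:FLOAT'),
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
               create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))

if __name__ == '__main__':
    run()

For the daily cadence, note that Dataflow itself does not schedule runs; the job would need to be launched on a schedule, for example via Cloud Scheduler triggering a Dataflow template, or a cron job running the script above.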