1 vote

Context

I'm fairly new to the Google Cloud Platform and I'm trying out Google Dataflow. I read a CSV file and use it to simulate streaming data: each CSV row is published to a Pub/Sub topic, and Dataflow reads it and inserts the data into a BigQuery table.

Problem

When my file contains only the types STRING, FLOAT and INTEGER, the process finishes successfully and the data is loaded into BigQuery. But as soon as I add one of the types DATETIME, TIME or DATE, it always fails. There are many code examples out there, but I couldn't find one that shows how to handle these types.

Data Examples

2017-01-23 - load it into DATE type

14:10:12 - load it into TIME type

I hope you guys can help me with this ...

Please provide more details about your code and a complete printout of the failure. Merely knowing that the pipeline fails is insufficient to help you with debugging. - jkff
We use TIMESTAMP as the date format in BQ. When adding data to BQ from Dataflow we write the date as a String with the specific timestamp format yyyy-MM-dd hh:MM:ss.SSS. This works perfectly for us in all the pipelines we build. Hope this helps - Jack
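A side note on the pattern in the comment above: in SimpleDateFormat, hh is the 12-hour clock and MM is the month, so yyyy-MM-dd hh:MM:ss.SSS would put the month in the minutes slot. The 24-hour pattern is presumably what was intended; a minimal sketch (the class and method names are my own):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TimestampFormat {
    // Round-trips a timestamp string through SimpleDateFormat using the
    // corrected pattern letters: HH = 24-hour clock, mm = minutes.
    static String roundTrip(String ts) throws ParseException {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(sdf.parse(ts));
    }

    public static void main(String[] args) throws ParseException {
        // Prints the input back unchanged, confirming the pattern is lossless.
        System.out.println(roundTrip("2017-01-23 14:10:12.000"));
    }
}
```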
So, did you find out which POJO to send to BigQuery for the Date and DateTime formats? - vdolez

1 Answer

1 vote

I tried a few things:

// 1. Parse the string into a java.util.Date and set that on the row
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
String dateInString = info.getEventDate();
Date date = sdf.parse(dateInString);
row.set("EventDate", date);


// 2. Set a Joda-Time DateTime on the row
row.set("EventDate", new DateTime("2017-01-23"));


// 3. Set the raw string on the row
public String getEventDate() {
    return get(Field.EventDate);
}
...
row.set("EventDate", info.getEventDate());


// 4. Parse the string with a Joda-Time formatter
private static DateTimeFormatter dmt = DateTimeFormat.forPattern("yyyy-MM-dd");
...
DateTime ds = dmt.parseDateTime(info.getEventDate());
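For what it's worth, BigQuery's streaming-insert API represents DATE, TIME and DATETIME values as canonically formatted strings ("yyyy-MM-dd", "HH:mm:ss", "yyyy-MM-dd HH:mm:ss"), not as java.util.Date or Joda DateTime objects, so of the attempts above, setting a String (attempt 3) is the closest. A minimal sketch of producing such strings (the class and method names are my own, not part of any BigQuery library):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class BqCivilTime {
    // BigQuery streaming inserts accept DATE, TIME and DATETIME values as
    // canonically formatted strings, so the TableRow should carry a String.
    static String bqDate(LocalDate d) {
        return d.format(DateTimeFormatter.ofPattern("yyyy-MM-dd"));
    }

    static String bqTime(LocalTime t) {
        return t.format(DateTimeFormatter.ofPattern("HH:mm:ss"));
    }

    static String bqDateTime(LocalDateTime dt) {
        return dt.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }

    public static void main(String[] args) {
        // In the pipeline this would be e.g.:
        //   row.set("EventDate", bqDate(LocalDate.parse(info.getEventDate())));
        System.out.println(bqDate(LocalDate.parse("2017-01-23")));  // 2017-01-23
        System.out.println(bqTime(LocalTime.parse("14:10:12")));    // 14:10:12
    }
}
```

Parsing with java.time first (rather than passing the raw CSV field straight through) also validates the value, so a malformed row fails in the pipeline instead of at insert time.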