
Context

I'm fairly new to Google Cloud Platform and I'm trying out Google Dataflow. I read a CSV file and simulate streaming data: each CSV row is published to a Pub/Sub topic, and Dataflow reads it and inserts the data into a BigQuery table.

Problem

When my file contains only the types STRING, FLOAT, and INTEGER, my process finishes successfully and the data is loaded into BigQuery. But if I add any of the types DATETIME, TIME, or DATE, it always fails. There are many code examples out there, but I couldn't find one that shows how to handle these types.

Data Examples

2017-01-23 - to be loaded into a DATE column

14:10:12 - to be loaded into a TIME column
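For values shaped like these examples, BigQuery's streaming-insert JSON represents DATE and TIME as plain formatted strings. A minimal sketch of formatting them with `java.time` (a `HashMap` stands in for `TableRow` so the snippet runs without GCP dependencies; the field names `EventDate`/`EventTime` are hypothetical):

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

public class CivilTimeRow {
    // A HashMap stands in for com.google.api.services.bigquery.model.TableRow;
    // the real TableRow likewise accepts String values for DATE/TIME columns.
    static Map<String, Object> buildRow(String rawDate, String rawTime) {
        LocalDate date = LocalDate.parse(rawDate);   // expects yyyy-MM-dd
        LocalTime time = LocalTime.parse(rawTime);   // expects HH:mm:ss
        Map<String, Object> row = new HashMap<>();
        row.put("EventDate", date.toString());
        row.put("EventTime", time.format(DateTimeFormatter.ofPattern("HH:mm:ss")));
        return row;
    }
}
```

Parsing first means a malformed CSV value fails fast with a clear exception instead of producing a bad row.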

I hope you can help me with this.

Please provide more details about your code and a complete printout of the failure. Merely knowing that the pipeline fails is insufficient to help you with debugging. - jkff
We use TIMESTAMP as the date format in BQ. When adding data to BQ from Dataflow, we write the date as a String with the specific timestamp format yyyy-MM-dd hh:MM:ss.SSS. This has worked for us in all the pipelines we've built. Hope this helps. - Jack
So, did you find out which POJO to send to BigQuery for the Date and DateTime formats? - vdolez
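Following the comment above about writing timestamps as strings: note that in `SimpleDateFormat` syntax the pattern would be `yyyy-MM-dd HH:mm:ss.SSS` (`HH` is hour-of-day and `mm` is minutes; `hh` is the 12-hour clock and `MM` means months). A small sketch:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimestampString {
    // HH = hour-of-day (0-23), mm = minutes, SSS = milliseconds.
    static String format(Date d) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(d);
    }
}
```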

1 Answer


I tried a few things:

//1: parse to a java.util.Date with SimpleDateFormat and set the object
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
String dateInString = info.getEventDate();
Date date = sdf.parse(dateInString);

row.set("EventDate", date);


//2: set a Joda-Time DateTime object built from the raw string
row.set("EventDate", new DateTime("2017-01-23"));


//3: pass the raw string through unchanged
public String getEventDate() {
    return get(Field.EventDate);
}
...
row.set("EventDate", info.getEventDate());


//4: parse with a Joda-Time DateTimeFormatter
private static DateTimeFormatter dmt = DateTimeFormat.forPattern("yyyy-MM-dd");
...
DateTime ds = dmt.parseDateTime(info.getEventDate());
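Of these attempts, the one that matches how BigQuery's insert-all JSON encodes civil-time types is setting the field to a correctly formatted String (as in attempt //3), since DATE, TIME, and DATETIME values are serialized as text. A hedged sketch of normalizing raw CSV values first, so bad input fails in the pipeline rather than at insert time (`java.time` is used here in place of Joda-Time; the helper names are hypothetical):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class BqCivilTime {
    // Parse the raw CSV value to catch bad input early, then emit the
    // canonical string form for each BigQuery civil-time type.
    static String asBqDate(String raw) {       // DATE: yyyy-MM-dd
        return LocalDate.parse(raw).toString();
    }

    static String asBqTime(String raw) {       // TIME: HH:mm:ss
        return LocalTime.parse(raw).format(DateTimeFormatter.ofPattern("HH:mm:ss"));
    }

    static String asBqDateTime(String rawDate, String rawTime) { // DATETIME
        return LocalDateTime.of(LocalDate.parse(rawDate), LocalTime.parse(rawTime))
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }
}
```

The result of each helper can then be passed straight to `row.set(...)` as a String.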