
We run a nightly query against BigQuery via the Java REST API, specifying a destination table for the results to be pushed to (write disposition = WRITE_TRUNCATE). Today's query appeared to run without errors, but the results were not pushed to the destination table.
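
Roughly, the job is submitted like the sketch below (sketch only -- client setup, project/dataset/table ids and the query string are illustrative placeholders, not our production code):

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.Job;
import com.google.api.services.bigquery.model.JobConfiguration;
import com.google.api.services.bigquery.model.JobConfigurationQuery;
import com.google.api.services.bigquery.model.TableReference;

public class NightlyQueryJob {
  public static void main(String[] args) throws Exception {
    String projectId = "my-project";  // placeholder

    // Standard client setup with application-default credentials.
    GoogleCredential credential = GoogleCredential.getApplicationDefault()
        .createScoped(BigqueryScopes.all());
    Bigquery bigquery = new Bigquery.Builder(
            GoogleNetHttpTransport.newTrustedTransport(),
            JacksonFactory.getDefaultInstance(),
            credential)
        .setApplicationName("nightly-query")
        .build();

    // Same configuration as the logged JSON below: destination table,
    // CREATE_IF_NEEDED, WRITE_TRUNCATE.
    TableReference destination = new TableReference()
        .setProjectId(projectId)
        .setDatasetId("my_dataset")   // placeholder
        .setTableId("my_table");      // placeholder
    JobConfigurationQuery queryConfig = new JobConfigurationQuery()
        .setQuery("SELECT ...")       // placeholder for the nightly query
        .setDestinationTable(destination)
        .setCreateDisposition("CREATE_IF_NEEDED")
        .setWriteDisposition("WRITE_TRUNCATE");
    Job job = new Job().setConfiguration(new JobConfiguration().setQuery(queryConfig));

    // jobs.insert starts the query; we then poll jobs.get until the status is DONE.
    Job started = bigquery.jobs().insert(projectId, job).execute();
    System.out.println("Started job " + started.getJobReference().getJobId());
  }
}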

This query has been running for a few weeks now and we've had no issues. No code changes were made either.

Manually running it a second time after it "failed" worked fine. It was just this one glitch that we spotted and we're concerned it may happen again.

Our logged JSON response from the "failed" query looks fine (I've obfuscated any sensitive data):

INFO: Job finished successfully: {
  "configuration" : {
    "dryRun" : false,
    "query" : {
      "createDisposition" : "CREATE_IF_NEEDED",
      "destinationTable" : {
        "datasetId" : "[REMOVED]",
        "projectId" : "[REMOVED]",
        "tableId" : "[REMOVED]"
      },
      "priority" : "INTERACTIVE",
      "query" : "[REMOVED]",
      "writeDisposition" : "WRITE_TRUNCATE"
    }
  },
  "etag" : "[REMOVED]",
  "id" : "[REMOVED]",
  "jobReference" : {
    "jobId" : "[REMOVED]",
    "projectId" : "[REMOVED]"
  },
  "kind" : "bigquery#job",
  "selfLink" : "[REMOVED]",
  "statistics" : {
    "creationTime" : "1390435780070",
    "endTime" : "1390435780769",
    "query" : {
      "cacheHit" : false,
      "totalBytesProcessed" : "12546"
    },
    "startTime" : "1390435780245",
    "totalBytesProcessed" : "12546"
  },
  "status" : {
    "state" : "DONE"
  }
}

Using the "try it!" for Jobs/GET here and plugging in the job id also shows the job was indeed successful and matches our logged output (pasted above).

Checking the web console shows the destination table has been truncated but not repopulated. Weirdly, its "Last Modified" timestamp has not been updated either (I did try refreshing the page numerous times):

http://i.stack.imgur.com/384NL.png

Has anyone experienced this before with BigQuery - a query appearing to run successfully, but the results never written to the specified destination table even though the table was truncated?


2 Answers


I am a developer on the BigQuery team. I've looked up the details of your job from the breadcrumbs you left (your query was the only one that started at that start time).

It looks like your destination table was truncated at 4:09 pm today PST, which is the time your job ran, but it was left empty -- the query that truncated it didn't actually fill in any information.

I'm having a little bit of trouble piecing together the details, because one of the source tables appears to have been overwritten (the left table in your left outer join was created at 4:20 PM).

However, there is a clue in the "total bytes processed" field -- it says the query only processed 12K of data. The internal statistics say that only 384 rows were involved across the two tables in the query.

My guess is that the query legitimately returned 0 rows, so the table was cleared.

There is a bug in that deleting all of the data in a table doesn't update the last-modified time. We use last modified to mean either the last time the metadata was updated (description, schema, etc.) or the last time data was added to the table. But if you just truncate the table, that doesn't update the metadata or add data, so we end up with a stale last-modified time.
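
If you want to sanity-check this on your side, one option is to inspect the destination table's metadata right after the job reports DONE -- a rough sketch using the same illustrative Java client and placeholder names as in your question:

// After the query job reports DONE, fetch the destination table's metadata.
Table table = bigquery.tables().get(projectId, "my_dataset", "my_table").execute();
System.out.println("numRows          = " + table.getNumRows());           // 0 would match "truncated but empty"
System.out.println("lastModifiedTime = " + table.getLastModifiedTime());  // can be stale after a truncate, per the bug above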

If this doesn't sound like a reasonable chain of events, we'll need more information from you to debug it (especially since it looks like the tables involved have been modified since you ran this query), and a way for us to reproduce it would be great.


So, we figured out what the problem is with this. It failed again a few times over the last few days so we dug in further.

The query being executed depends on another query which is executed immediately before it. Although we do wait for the first query to finish (job status = "DONE"), it appears that behind the scenes it is not actually fully complete and its data is not yet available to be used.

Current process is:

  1. Fetch data from another data source and stream the results into table A
  2. When (1) is complete (poll the job id until its status is "DONE"), submit another query which joins on the results in table A to create table B
  3. Table A's data is not yet available, so the query from (2) results in an empty table

We've noticed it takes about 5-10 seconds for the data to actually appear and be available in BigQuery when using streaming for the first query.

We used a fairly ugly workaround: simply wait a few seconds after the first query before running the next one. Not exactly elegant, but it works.
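
For completeness, the workaround boils down to something like this (sketch only; the bigquery client, projectId and tableBJob are the illustrative placeholders from the sketches in the question, and the delay value is just illustrative -- anything comfortably above the 5-10 seconds we observed):

// 1. Run the first job (which gets data into table A) and poll jobs.get until DONE.
// 2. Even after DONE, the streamed data can take several seconds to become
//    queryable, so pause before submitting the dependent query.
Thread.sleep(15000L);  // illustrative buffer above the observed 5-10s lag

// 3. Now submit the second query, which joins on table A to build table B
//    (same jobs.insert pattern as the sketch in the question).
bigquery.jobs().insert(projectId, tableBJob).execute();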