0
votes

I am getting a Bigquery internal error when attempting to perform a full outer join. The query succeeds if I use an inner join.

The job statistics can be seen in:

GET https://www.googleapis.com/bigquery/v2/projects/quantum-gearbox-90015/jobs/job_AQtc0zMEkDHkk-OmCXCGRk6zycs?key={YOUR_API_KEY}

The query is as follows:

SELECT CASE WHEN t1.userId IS NOT NULL THEN t1.userId ELSE t2.userId END AS userId, CASE WHEN t1.jobId IS NOT NULL THEN t1.jobId ELSE t2.jobId END AS jobId, CASE WHEN t1.result_duration IS NOT NULL THEN t1.result_duration ELSE t2.result_duration END AS result_duration, CASE WHEN t1.job_submitted_ts IS NOT NULL THEN t1.job_submitted_ts ELSE t2.job_submitted_ts END AS job_submitted_ts, CASE WHEN t1.job_running_ts IS NOT NULL THEN t1.job_running_ts ELSE t2.job_running_ts END AS job_running_ts, CASE WHEN t1.job_complete_ts IS NOT NULL THEN t1.job_complete_ts ELSE t2.job_complete_ts END AS job_complete_ts, CASE WHEN t1.userId IS NULL AND t2.userId IS NOT NULL THEN t2.last_updated_ts WHEN t1.jobId IS NULL AND t2.jobId IS NOT NULL THEN t2.last_updated_ts WHEN t1.result_duration IS NULL AND t2.result_duration IS NOT NULL THEN t2.last_updated_ts WHEN t1.job_submitted_ts IS NULL AND t2.job_submitted_ts IS NOT NULL THEN t2.last_updated_ts WHEN t1.job_running_ts IS NULL AND t2.job_running_ts IS NOT NULL THEN t2.last_updated_ts WHEN t1.job_complete_ts IS NULL AND t2.job_complete_ts IS NOT NULL THEN t2.last_updated_ts ELSE t1.last_updated_ts END AS last_updated_ts FROM ( SELECT userId, jobId, result_duration, job_submitted_ts, job_running_ts, job_complete_ts, last_updated_ts FROM [activeJobs.jobScoreboard] )t1 FULL OUTER JOIN EACH ( SELECT userId, jobId, result_duration, job_submitted_ts, job_running_ts, job_complete_ts, INTEGER(ROUND(NOW()/1000,0)) AS last_updated_ts FROM ( SELECT userId, jobId, MAX(CASE WHEN eventType = 'WORKITEM_COMPLETE' THEN data.result.data.duration END) AS result_duration, MIN(CASE WHEN eventType = 'JOB_SUBMITTED' THEN timestamp END) AS job_submitted_ts, MIN(CASE WHEN eventType = 'JOB_RUNNING' THEN timestamp END) AS job_running_ts, MAX(CASE WHEN eventType = 'JOB_COMPLETE' THEN timestamp END) AS job_complete_ts, FROM [trebekProd.JobActivity_20151125T000000Z_20151126T000000Z@1448472935436-] GROUP BY userId, jobId))t2 ON t1.jobId = t2.jobId

1
The job query returns in part the following response: "status": { "state": "DONE", "errorResult": { "reason": "internalError", "message": "An internal error occurred and the request could not be completed." }, "errors": [ { "reason": "internalError", "message": "An internal error occurred and the request could not be completed." } ] }, "statistics": { "creationTime": "1448477868181", "startTime": "1448477868667", "endTime": "1448477872929" }, "user_email": "gquinones@brightcove.com" }user1849860
Can you please give me your jobid so I can debug more efficiently? (Whoops. Sorry. Just found it in your link above...)Dan Delorey

1 Answers

0
votes

The root cause of your error was a temporary overload of our streaming ingestion service. It was unable to serve the data for your table at the time you ran your query because of the aggregate load on the system. We will investigate this further and fix the underlying issue. For now, if you run your query again it should succeed.