
I need advice on processing batch jobs against newly uploaded tables (using PHP and CSV).

Currently, the process I run every week is to upload two tables (one with ~400,000 rows / ~24 MB of data, the other with ~7,000 rows / ~627 KB) and then schedule queries with batch priority that process the uploaded data and save the results into a new table.

When I run batch queries during the daytime, they usually start with a substantial delay, around 20 minutes or so. The problem is that during the upload procedure BigQuery sometimes runs them almost immediately, before the load jobs have finished, so some of them fail with a "Table not found" error and are skipped.

Recent upload:

- Upload table 1: "job_75ae1fa6db89418b8fe2b6c443501246"
- Upload table 2: "job_a79c39ae528944848fab85650b94a5d7"
- One of the batch jobs showing the recent error: "job_dd18580ccb51486dabf82d1d408a3199"

The question is: is this behavior correct for batch jobs? And is there a way to predict or schedule their execution time, or do I just need to separate them and run them at different times?


1 Answer


You're explicitly not given many guarantees about when batch jobs will run, and I would take that seriously. You can, however, use a jobs.get call afterwards to find out when a job did run.

The point of batch jobs is that they can be run on machines that would otherwise be idle. Nobody knows in advance what the availability of such machines will be. If this is a problem for you, don't schedule batch jobs.
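In practice, the workaround for the "Table not found" race is to poll the load jobs until they report DONE and only then submit the dependent batch queries. A minimal sketch (the original setup uses PHP; this is Python, and `fetch_job_state` is a hypothetical stand-in for a jobs.get request against the BigQuery REST API, which reports a job state of "PENDING", "RUNNING", or "DONE"):

```python
import time


def wait_for_jobs(job_ids, fetch_job_state, poll_seconds=10, timeout_seconds=3600):
    """Block until every job in job_ids reports the state "DONE".

    fetch_job_state(job_id) is assumed to return the job's current
    state string, e.g. by calling the BigQuery jobs.get endpoint and
    reading status.state from the response.
    """
    deadline = time.time() + timeout_seconds
    pending = set(job_ids)
    while pending:
        if time.time() > deadline:
            raise TimeoutError("jobs still not DONE: %s" % sorted(pending))
        # Re-check only the jobs that have not finished yet.
        pending = {j for j in pending if fetch_job_state(j) != "DONE"}
        if pending:
            time.sleep(poll_seconds)


# Usage sketch: wait for both load jobs, then submit the batch queries;
# at that point the uploaded tables are guaranteed to exist.
#
# wait_for_jobs(load_job_ids, fetch_job_state)
# submit_batch_queries()
```

This decouples the two phases: the load jobs finish on their own schedule, and the batch queries are never submitted against tables that don't exist yet, regardless of how long BigQuery delays (or how quickly it starts) the batch-priority work.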