1 vote

My Cloud Function is called 15 times per hour (every 4 minutes) and runs a query via startQuery. An "Invalid Credentials" error starts happening randomly after about 30 minutes, then more and more frequently until all calls fail.

This query reads data from a table in one dataset and saves the result to a table in another dataset, via the destination and writeDisposition=WRITE_TRUNCATE options. Both datasets are located in the EU.

Redeploying the function removes the problem temporarily.

A call to gcloud beta functions describe my-function indicates that it uses the App Engine default service account: <project-id>@appspot.gserviceaccount.com.

Here are the error details:

ApiError: Invalid Credentials
at Object.parseHttpRespBody (/user_code/node_modules/@google-cloud/bigquery/node_modules/@google-cloud/common/src/util.js:192:30)
at Object.handleResp (/user_code/node_modules/@google-cloud/bigquery/node_modules/@google-cloud/common/src/util.js:132:18)
at /user_code/node_modules/@google-cloud/bigquery/node_modules/@google-cloud/common/src/util.js:465:12
at Request.onResponse [as _callback] (/user_code/node_modules/@google-cloud/bigquery/node_modules/retry-request/index.js:160:7)
at Request.self.callback (/user_code/node_modules/@google-cloud/bigquery/node_modules/request/request.js:188:22)
at emitTwo (events.js:106:13)
at Request.emit (events.js:191:7)
at Request.<anonymous> (/user_code/node_modules/@google-cloud/bigquery/node_modules/request/request.js:1171:10)
at emitOne (events.js:96:13)
at Request.emit (events.js:188:7)

Edit

The code, stripped:

const inspect = require('util').inspect;

const bigquery = require('@google-cloud/bigquery')();

const destinationDataset = bigquery.dataset('destinationDataset');
const destinationTable = destinationDataset.table('destinationTable');

exports.aggregate = function aggregate (event) {
  const message = event.data;
  const attributes = message.attributes;

  let job;
  return Promise.resolve()
    .then(() => {
      if (attributes.partition == null) {
        throw new Error('Partition time not provided. Make sure you have a "partition" attribute in your message');
      }

      const query = 'SELECT ... FROM sourceTable WHERE _PARTITIONTIME = TIMESTAMP(@partitionTime)'; // The dataset is specified in the job options below

      // Query options list: https://cloud.google.com/bigquery/docs/reference/v2/jobs/query
      // and: https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/0.9.6/bigquery?method=startQuery
      const options = {
        destination: destinationTable,
        writeDisposition: 'WRITE_TRUNCATE',
        query: query,
        defaultDataset: { datasetId: 'sourceDataset' },
        timeoutMs: 540000, // 9 minutes, same timeout as the Cloud Function
        useLegacySql: false,
        parameterMode: 'NAMED',
        queryParameters: [{
          name: 'partitionTime',
          parameterType: { type: 'STRING' },
          parameterValue: { value: attributes.partition }
        }]
      };

      return bigquery.startQuery(options);
    })
    .then((results) => {
      job = results[0];
      console.log(`BigQuery job ${job.id} started, generating data in ${destinationTable.dataset.id}.${destinationTable.id}.`);
      return job.promise();
    })
    .then(() => {
      // Get the job's status
      return job.getMetadata();
    })
    .then((metadata) => {
      // Check the job's status for errors
      const errors = metadata[0].status.errors;
      if (errors && errors.length > 0) {
        throw errors;
      }
    })
    // Only if a destination table is given
    .then(() => {
      console.log(`BigQuery job ${job.id} completed, data generated in ${destinationTable.dataset.id}.${destinationTable.id}.`);
    })
    .catch((err) => {
      console.log(`Job failed for ${inspect(attributes)}: ${inspect(err)}`);
      return Promise.reject(err);
    });
};
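
As a stopgap while the root cause is unclear, one option would be to retry startQuery when the transient error shows up. This is only a sketch under assumptions I haven't verified: that the ApiError exposes the HTTP status as err.code and that "Invalid Credentials" maps to 401; the delay and attempt count are arbitrary.

// Hypothetical retry wrapper around bigquery.startQuery. Assumes the
// transient "Invalid Credentials" failure surfaces as err.code === 401.
function startQueryWithRetry (options, attemptsLeft) {
  return bigquery.startQuery(options)
    .catch((err) => {
      if (attemptsLeft > 1 && err.code === 401) {
        // Wait 5 seconds (arbitrary) before trying again.
        return new Promise((resolve) => setTimeout(resolve, 5000))
          .then(() => startQueryWithRetry(options, attemptsLeft - 1));
      }
      return Promise.reject(err);
    });
}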

You'll notice that I pass no options when initializing the bigquery object: require('@google-cloud/bigquery')().

Should I create a service account with the BigQuery Job User role, and use the RuntimeConfig API to avoid pushing the credentials to the git origin?
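
If I do switch to a dedicated service account, a minimal sketch of initializing the client with explicit credentials would be the following (the project ID and key file path are placeholders, and the key file would be supplied at deploy time rather than committed):

// Hypothetical: explicit credentials instead of the ambient default
// service account. 'my-project' and the key path are placeholders.
const bigquery = require('@google-cloud/bigquery')({
  projectId: 'my-project',
  keyFilename: '/path/to/service-account-key.json'
});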

The question remains why I get this error randomly. Looking at the function logs now, the error happened on every call between midnight and 4am CEST, then on one third of the calls until 5:36am. Since then (4 hours ago), it has not happened once.

Edit 2

This chart shows the frequency of failed invocations compared to successful ones. All the errors (in green) are "Invalid Credentials" errors. Absolutely nothing was touched during those 7 days: no deployments, no configuration changes, no fiddling in BigQuery.

[Chart: frequency of credentials errors]

Can you show us a sample of your code? - Felipe Hoffa
Done. Added an Edit section. - nfo
I hit the same issue. I have a cloud function that gets notified of new Cloud Storage uploads (which happen every minute), so the function is invoked every minute; it just calls BigQuery to load the file into a table. The function was deployed two weeks ago and ran without problems until today, when the ApiError started happening randomly on some invocations, not all. This must be an operational issue on Google's side. - user5672998

1 Answer

1 vote

My workaround is to use a GCE VM and run a Node.js app that wraps the function, like:

const pubsub = require('@google-cloud/pubsub')();
const topic = pubsub.topic('...');
const subscription = topic.subscription('...');

subscription.on('message', (msg) => {
  // The same cloud function does all the BigQuery data loading and querying.
  callTheSameCloudFunction({ data: msg });
});

Depending on which kind of trigger the function uses, the wrapper would look different.
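
For example, a sketch of a wrapper for an HTTP-triggered function (assuming Express and body-parser; the route and port are placeholders):

const express = require('express');
const bodyParser = require('body-parser');

const app = express();
app.use(bodyParser.json());

app.post('/aggregate', (req, res) => {
  // Reuse the same function body that was deployed to GCF.
  callTheSameCloudFunction({ data: req.body })
    .then(() => res.sendStatus(200))
    .catch(() => res.sendStatus(500));
});

app.listen(8080);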

The upside is that the wrapper is simple; it has been running in our production environment for some months already without any problems.

My conclusion is that Google Cloud operates many services in beta that aren't really production-ready; I would keep away from GCF, or maybe revisit it 6 months later.

EDIT: see https://issuetracker.google.com/issues/66695033, where the Google Cloud team claims a fix, but I haven't had time to test it yet; my wrapper approach just costs one small VM and is very cheap.