2
votes

I am using the Google Speech API to transcribe long files. The API is called from Google Cloud Functions. I want to check the result of longRunningRecognize later with operations.get. I know the name/ID of the operation, but I cannot find a good way to check its status by name from a Cloud Function.

Of course, I can just make an HTTP GET request to this URL:

https://speech.googleapis.com/v1/operations/{name}?key=API_KEY 

This is an example code that works:

const functions = require('firebase-functions');
const speech = require('@google-cloud/speech');
const request = require('request');

exports.transcribe = functions.storage.object().onFinalize((object) => {
  // some code to get data required for speech API
  const payload = {
    audio: {
      uri: 'some_uri/to/google/storage/file'
    },
    config: {
      encoding: 'FLAC',
      languageCode: 'en-US'
    }
  };

  const client = new speech.SpeechClient({
    projectId: 'my-project-id'
  });

  client.longRunningRecognize(payload)
    .then(responses => {
      const operation = responses[0];
      // current example of getting operation status by operation name with HTTP call
      request(`https://speech.googleapis.com/v1/operations/${operation.latestResponse.name}?key=MY-API-KEY`, (error, response, body) => {
        console.log('Operation status response: ', body);
      });
    });
});

But it seems like there should be a cleaner way of doing this. I did find this Ruby way of getting operation status and this description of OperationsClient, so I want something like this to check the status:

// this line is the most confusing part of the puzzle
const client = longrunning.operationsClient();
const name = '';
client.getOperation({name: name}).then(function(responses) {
  var response = responses[0];
  // doThingsWith(response)
});

Thanks for any help!


2 Answers

3
votes

I suspect you've moved on, but I'll answer for the sake of others.

When you call longRunningRecognize, the SDK starts polling the long-running operations.get endpoint for you. You just need to attach a Node event listener using .on.

The Operation object (the first array element of the promise returned from longRunningRecognize) emits Node events named progress, complete and error.

An update of OP's code:

client.longRunningRecognize(payload)
  .then(responses => {
    const operation = responses[0];
    operation.on('progress', (metadata, apiResponse) => {
      console.log(JSON.stringify(metadata));
    });
  });

Example output: (same as https://speech.googleapis.com/v1/operations/...)

{"startTime":{"seconds":"1529629181","nanos":790333000},"lastUpdateTime":{"seconds":"1529629182","nanos":661910000}}
{"progressPercent":26,"startTime":{"seconds":"1529629181","nanos":790333000},"lastUpdateTime":{"seconds":"1529629245","nanos":48465000}}
{"progressPercent":52,"startTime":{"seconds":"1529629181","nanos":790333000},"lastUpdateTime":{"seconds":"1529629307","nanos":516891000}}
{"progressPercent":78,"startTime":{"seconds":"1529629181","nanos":790333000},"lastUpdateTime":{"seconds":"1529629369","nanos":680341000}}

Note that there is no apparent way to get the partially transcribed text with an async operation, only the status percentage.
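
For the other two events mentioned above (complete and error), here is a minimal sketch of wrapping the listeners in a promise. waitForOperation is a hypothetical helper of my own, not part of @google-cloud/speech, and the fake operation is just a plain EventEmitter standing in for the real Operation object so the snippet runs without credentials. I'm also assuming the completed result has the results[].alternatives[].transcript shape.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not part of @google-cloud/speech): resolves with the
// final result when the operation emits 'complete', rejects on 'error', and
// forwards every 'progress' metadata object to an optional callback.
function waitForOperation(operation, onProgress = () => {}) {
  return new Promise((resolve, reject) => {
    operation.on('progress', metadata => onProgress(metadata));
    operation.on('complete', result => resolve(result));
    operation.on('error', reject);
  });
}

// Stand-in for the real Operation object (assumption: it only needs .on/.emit here).
const fakeOperation = new EventEmitter();

waitForOperation(fakeOperation, m => console.log('progress:', m.progressPercent))
  .then(result => console.log('done:', result.results[0].alternatives[0].transcript))
  .catch(err => console.error('failed:', err.message));

// Simulate what the SDK would emit while it polls.
fakeOperation.emit('progress', { progressPercent: 50 });
fakeOperation.emit('complete', {
  results: [{ alternatives: [{ transcript: 'hello world' }] }]
});
```

With the real client you would pass responses[0] from longRunningRecognize instead of fakeOperation. Keep in mind the function instance has to stay alive for the whole transcription for this to work.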

0
votes

I just ran into a similar problem.

Here's the code you're looking for to check the status of a longRunningRecognize with an operation name:

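const speech = require('@google-cloud/speech');

const client = new speech.SpeechClient();
const operationName = '...';
client.checkLongRunningRecognizeProgress(operationName).then(res => {
  if (res.done) {
    // res.result is the decoded LongRunningRecognizeResponse;
    // its transcription results are in the `results` array
    const response = res.result.results[0];
    // doThingsWith(response)
  }
});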
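
If you want to do something with the returned object, here is a rough sketch of a helper that turns it into a small status summary. summarizeOperation is a hypothetical name of mine, not a library function. It assumes the object has done, metadata.progressPercent and, once finished, a result shaped like LongRunningRecognizeResponse (results[].alternatives[].transcript).

```javascript
// Hypothetical helper: summarizes the object returned by
// client.checkLongRunningRecognizeProgress(operationName).
function summarizeOperation(op) {
  if (!op.done) {
    // progressPercent can be absent right after the operation starts.
    const progressPercent = (op.metadata && op.metadata.progressPercent) || 0;
    return { done: false, progressPercent };
  }
  // Assumption: each result carries its best guess as the first alternative.
  const transcript = op.result.results
    .filter(r => r.alternatives && r.alternatives.length > 0)
    .map(r => r.alternatives[0].transcript)
    .join('\n');
  return { done: true, transcript };
}

// Usage with the real client (needs credentials, so left commented out):
// const op = await client.checkLongRunningRecognizeProgress(operationName);
// console.log(summarizeOperation(op));
```

This keeps the check stateless, so a second function (HTTP-triggered or scheduled) can look up the status later using only the stored operation name.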