2
votes

I'm new to GCP. While reading the Google Speech API documentation, I came across this: "Asynchronous Recognition (REST and gRPC) sends audio data to the Speech API and initiates a Long Running Operation. Using this operation, you can periodically poll for recognition results." But what does "a Long Running Operation" actually mean? And what is the difference between the synchronous and asynchronous recognition processes? I searched online and found this answer: https://www.quora.com/What-is-the-difference-between-synchronous-and-asynchronous-speech-recognition but I still don't get the idea. Can anyone explain in more detail? I'd really appreciate your answer :)

1 Answer

1
votes
  • Asynchronous cloud requests usually return an ID indicating that the request has been enqueued for processing; later you can use that ID to check the status and retrieve the results once it is done.
  • Synchronous requests return the results as part of the response, but they may block for a longer time.
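To make the two models concrete, here is a minimal toy sketch in Python. It does not call the real Speech API: `ToySpeechService`, its method names, and the fake transcript string are all hypothetical stand-ins, chosen only to mirror the "block and return the result" versus "return an ID, poll later" behaviour described above.

```python
import threading
import time
import uuid


class ToySpeechService:
    """In-memory stand-in for a speech service (hypothetical, not the real API)."""

    def __init__(self, work_seconds=0.2):
        self.work_seconds = work_seconds
        self._operations = {}
        self._lock = threading.Lock()

    def _transcribe(self, audio):
        time.sleep(self.work_seconds)  # pretend to do the recognition work
        return f"transcript of {audio}"

    def recognize(self, audio):
        """Synchronous: the caller blocks until the transcript is ready."""
        return self._transcribe(audio)

    def recognize_long_running(self, audio):
        """Asynchronous: return an operation ID at once, work in the background."""
        op_id = uuid.uuid4().hex
        with self._lock:
            self._operations[op_id] = {"done": False, "response": None}

        def worker():
            result = self._transcribe(audio)
            with self._lock:
                self._operations[op_id] = {"done": True, "response": result}

        threading.Thread(target=worker, daemon=True).start()
        return op_id

    def get_operation(self, op_id):
        """Return a snapshot of the operation's current state."""
        with self._lock:
            return dict(self._operations[op_id])


def transcribe_async(service, audio, poll_interval=0.05):
    """Start a long-running operation and poll it until it reports done."""
    op_id = service.recognize_long_running(audio)
    while True:
        op = service.get_operation(op_id)
        if op["done"]:
            return op["response"]
        time.sleep(poll_interval)
```

In the synchronous call the client is stuck waiting for the whole duration of the work; in the asynchronous one it gets the ID back immediately and is free to do other things between polls.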

You can use the gcloud command-line tool to try both. Use a sync request for audio shorter than 60 seconds:

gcloud ml speech recognize AUDIO_FILE ...

and an async request for audio longer than 60 seconds:

gcloud ml speech recognize-long-running AUDIO_FILE ...

The latter returns an OPERATION_ID instead of a transcript, which you can later pass to

gcloud ml speech operations describe OPERATION_ID

to obtain results.
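If you want to automate the "check again later" step instead of re-running the describe command by hand, a polling loop along these lines may help. This is only a sketch under assumptions: `wait_for_operation` and `OperationFailed` are hypothetical names, and `fetch` is any callable you supply that returns the operation as a dict shaped like the Long Running Operation JSON (`name`, `done`, and then either `response` or `error`). I am assuming `done` may be missing while the operation is still running, so the code treats a missing key as "not done".

```python
import time


class OperationFailed(Exception):
    """Raised when a finished operation carries an error instead of a response."""


def wait_for_operation(fetch, interval=1.0, timeout=60.0):
    """Call fetch() repeatedly until the operation is done or the timeout expires.

    fetch: hypothetical callable returning the operation as a dict.
    Returns the operation's "response" value on success.
    """
    deadline = time.monotonic() + timeout
    while True:
        op = fetch()
        if op.get("done", False):  # a missing "done" is treated as still running
            if "error" in op:
                raise OperationFailed(op["error"])
            return op.get("response")
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish in time")
        time.sleep(interval)
```

A fixed interval keeps the sketch short; for long audio you would probably want to grow the interval between polls rather than ask every second.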

TIP: You can add the --log-http flag to see which API requests gcloud is making, which gives more insight into what is going on at the API level.