I am using the Google Speech API to transcribe an audio file with the following Python script: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/cloud-client/transcribe_async.py. I run it with this command:

python transcribe_async.py 1503489730.193982.flac

This is the response I get:

Waiting for operation to complete...
Traceback (most recent call last):
  File "transcribe_async.py", line 102, in <module>
    transcribe_file(args.path)
  File "transcribe_async.py", line 52, in transcribe_file
    response = operation.result(timeout=200)
  File "/home/toto/anaconda3/lib/python3.5/site-packages/google/gax/__init__.py", line 596, in result
    raise GaxError(self._operation.error.message)
google.gax.errors.GaxError

I can't figure out what the error is. I may have configured the audio parameters incorrectly, but I really don't know.

Thanks

1 Answer

LINEAR16 is the only encoding accepted by AsyncRecognize: uncompressed 16-bit signed little-endian samples (Linear PCM). A FLAC file like the one in the question therefore has to be converted first. See the documentation.
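
To make "16-bit signed little-endian" concrete, here is a minimal Python sketch using only the standard library. The helper names pack_linear16 and duration_seconds are hypothetical, and the defaults (16000 Hz, mono) are assumptions borrowed from the sox command further down, not values the API requires.

```python
import struct


def pack_linear16(samples):
    """Pack integer samples (-32768..32767) as 16-bit signed little-endian bytes."""
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack("<{}h".format(len(samples)), *samples)


def duration_seconds(num_bytes, rate=16000, channels=1):
    """Estimate the duration of headerless LINEAR16 audio from its size in bytes."""
    # Each sample takes 2 bytes per channel.
    return num_bytes / (2 * channels * rate)


raw = pack_linear16([0, 1000, -1000, 32767, -32768])
```

Because a raw file has no header, its size is the only thing you can check: at 16000 Hz mono, one second of audio should take 32000 bytes.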

You can convert an MP3 to raw LINEAR16 with sox like this:

sox async.mp3 -t raw --channels=1 --bits=16 --rate=16000 --encoding=signed-integer --endian=little async.raw
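
If you would rather run the conversion from Python before calling the transcription script, a rough sketch could look like the following. It only wraps the same sox command with subprocess; the function names build_sox_command and convert_to_linear16 are hypothetical, and it assumes sox is installed, on your PATH, and able to read your input format.

```python
import subprocess


def build_sox_command(src, dst, rate=16000):
    """Return the sox argument list that converts src to raw LINEAR16 audio."""
    return [
        "sox", src,
        "-t", "raw",                   # headerless output
        "--channels=1",                # mono
        "--bits=16",                   # 16 bits per sample
        "--rate={}".format(rate),      # sample rate in Hz (assumed 16000 here)
        "--encoding=signed-integer",
        "--endian=little",
        dst,
    ]


def convert_to_linear16(src, dst, rate=16000):
    """Run sox; raises CalledProcessError if sox exits with a non-zero status."""
    subprocess.run(build_sox_command(src, dst, rate), check=True)


# Example (not executed here, since it needs sox and a real input file):
# convert_to_linear16("1503489730.193982.flac", "1503489730.193982.raw")
```

The sample rate you pass here has to match the sample rate declared in the recognition config of the script, otherwise the request may fail or return a poor transcript.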