Could you please give us some more information, such as which language and library version you are using for this part of your project?
Assuming you are using Python, you can find another official way to connect to the Google Cloud Speech-to-Text API here: https://cloud.google.com/speech-to-text/docs/basics
The way I usually do it is with the googleapiclient Python package, along with a JSON data structure instead of dictionary data.

    import base64
    import googleapiclient.discovery

    # speech_file is the path to your local audio file (defined elsewhere)
    with open(speech_file, 'rb') as speech:
        # Base64 encode the binary audio file for inclusion in the JSON
        # request. Decode to str, since bytes are not JSON serializable
        # in Python 3.
        speech_content = base64.b64encode(speech.read()).decode('utf-8')

    # Construct the request
    service = googleapiclient.discovery.build('speech', 'v1')
    service_request = service.speech().recognize(
        body={
            "config": {
                "encoding": "LINEAR16",    # raw 16-bit signed LE samples
                "sampleRateHertz": 16000,  # 16 kHz
                "languageCode": "en-US",   # a BCP-47 language tag
            },
            "audio": {
                "content": speech_content
            }
        })

    # Send the request; the response comes back as a parsed dictionary
    response = service_request.execute()

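Once the request has been executed with service_request.execute(), the response is a plain dictionary. Here is a minimal sketch of pulling the transcripts out of it, assuming the documented response shape (results, then alternatives, then transcript). The helper name extract_transcripts and the sample response are made up for illustration and are not part of the library:

```python
def extract_transcripts(response):
    """Return the top transcript of every result in a recognize response."""
    transcripts = []
    # "results" can be missing when nothing was recognized
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The first alternative is the most probable one
            transcripts.append(alternatives[0].get("transcript", ""))
    return transcripts


# Hand-written sample response for illustration (not real API output)
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello world", "confidence": 0.95}]}
    ]
}
print(extract_transcripts(sample_response))
```

With a real call you would pass the dictionary returned by execute() instead of sample_response.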
Refer to this official article if you don't know how to install Python packages: https://packaging.python.org/tutorials/installing-packages/#id13
For long-running requests, please refer to:
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/longrunningrecognize
The JSON structure of the request body in this case will be:

    {
      "config": {
        object(RecognitionConfig)
      },
      "audio": {
        object(RecognitionAudio)
      }
    }

Where RecognitionConfig is a JSON object of the kind:

    {
      "encoding": enum(AudioEncoding),
      "sampleRateHertz": number,
      "languageCode": string,
      "maxAlternatives": number,
      "profanityFilter": boolean,
      "speechContexts": [
        {
          object(SpeechContext)
        }
      ],
      "enableWordTimeOffsets": boolean
    }

And RecognitionAudio is of the kind:

    {
      // Union field audio_source can be only one of the following:
      "content": string,
      "uri": string
      // End of list of possible types for union field audio_source.
    }

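Since the audio object is a union (either content or uri, never both), it can help to build the body in one place. This is only a sketch: build_request_body is a hypothetical helper, and the bucket URI is a placeholder.

```python
import base64


def build_request_body(config, content=None, uri=None):
    """Build a request body with exactly one audio source (content or uri)."""
    if (content is None) == (uri is None):
        raise ValueError("Provide exactly one of 'content' or 'uri'")
    if content is not None:
        # Raw bytes are base64-encoded, then decoded to str for JSON
        audio = {"content": base64.b64encode(content).decode("utf-8")}
    else:
        audio = {"uri": uri}
    return {"config": config, "audio": audio}


config = {"encoding": "FLAC", "languageCode": "en-US"}
# Placeholder URI; use the Cloud Storage location of your own file
body = build_request_body(config, uri="gs://my-bucket/my-audio.flac")
print(body)
```

The resulting dictionary can be passed as the body argument of the request.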
For long-running recognition, you may also refer to this link:
https://developers.google.com/resources/api-libraries/documentation/speech/v1/java/latest/com/google/api/services/speech/v1/Speech.SpeechOperations.html
It is the Java client reference, but the Python package googleapiclient.discovery exposes the same methods.
For long-running requests, you just use the following method in your Python class:

    ...
    service_request = service.speech().longrunningrecognize(
        body={
            "config": {
                "encoding": "FLAC",
                "languageCode": "en-US",
                "enableWordTimeOffsets": True
            },
            "audio": {
                "uri": str('gs://speech-clips/' + self.audio_fqid)
            }
        }
    )
    ...

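Executing a longrunningrecognize request returns an operation rather than the transcript, so the result has to be polled. Below is a rough sketch of that loop, assuming the service object built with googleapiclient.discovery.build('speech', 'v1') and its operations().get(name=...) method. The helper name wait_for_operation and its poll_seconds and max_polls parameters are my own, not part of the library:

```python
import time


def wait_for_operation(service, name, poll_seconds=5, max_polls=60):
    """Poll a long-running operation until it is done; return its response."""
    for _ in range(max_polls):
        operation = service.operations().get(name=name).execute()
        if operation.get("done"):
            # A finished operation carries either "error" or "response"
            if "error" in operation:
                raise RuntimeError(operation["error"])
            return operation.get("response", {})
        time.sleep(poll_seconds)
    raise TimeoutError("Operation {} did not finish in time".format(name))


# Usage (needs credentials and network access, so it is left commented out):
# operation = service_request.execute()
# response = wait_for_operation(service, operation["name"])
```

The returned response has the same results and alternatives layout as a synchronous request, with per-word timing included when enableWordTimeOffsets is set.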