1
votes

I am working with the Google Cloud Speech-to-Text API using the Node.js client. I found the project at https://github.com/googleapis/nodejs-speech and tried out its samples. Everything worked, but I could not find a sample for alternativeLanguageCodes. It is supported in version v1p1beta1, as mentioned here: https://cloud.google.com/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig. If we provide alternativeLanguageCodes, the API is supposed to transcribe the audio in the most likely of the listed languages. What I have observed is that it always transcribes in the language specified in languageCode.

Has anyone had a chance to try this API? If so, can you explain how you got it to detect one of the alternative languages?

1
Can you provide a bit more information about the languageCode and alternativeLanguageCodes you are using? What is the actual language of your audio? Also, have you noticed any difference when changing the order of the alternativeLanguageCodes, or when adding different ones? - VictorGGl
Hi, here is the request object from the nodejs-speech sample function: const request = { config: { encoding: encoding, sampleRateHertz: sampleRateHertz, languageCode: 'en-US', alternativeLanguageCodes: ['hi-IN'] }, interimResults: false, }; - user2092512
Continuing my previous comment: I added alternativeLanguageCodes to the existing request object, which is passed to streamingRecognize. I tried different alternativeLanguageCodes but none of them worked. streamingRecognize always transcribed the input audio using languageCode, even when the language in the audio was different. - user2092512
Is it possible for you to share here some small audio file(s) that you are using? - VictorGGl
I am using the mic as the audio input. I used an example from github.com/googleapis/nodejs-speech/blob/master/samples and modified the streamingMicRecognize function in recognize.js: the request variable now contains alternativeLanguageCodes: [languageCodes.language1, languageCodes.language2] and model: 'command_and_search'. I also changed const speech = require('@google-cloud/speech'); to const speech = require('@google-cloud/speech').v1p1beta1; I hope you will be able to substitute these values and try it out. I am not able to paste the modified function due to the character limit. - user2092512
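
For reference, the change described in the comments above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: buildStreamingRequest is a hypothetical helper, the encoding and sample rate are assumed values for microphone input, and the limit of three alternative codes is an assumption based on the RecognitionConfig reference and may change while the feature is in beta.

```javascript
// Hypothetical helper: builds the object that would be passed to
// client.streamingRecognize(request) on a v1p1beta1 SpeechClient.
function buildStreamingRequest(primaryLanguage, alternativeLanguages) {
  if (!Array.isArray(alternativeLanguages)) {
    throw new TypeError('alternativeLanguages must be an array of BCP-47 tags');
  }
  // Assumption: the API accepts only a small number of alternatives (three here).
  if (alternativeLanguages.length > 3) {
    throw new RangeError('at most 3 alternative language codes are assumed to be allowed');
  }
  return {
    config: {
      encoding: 'LINEAR16',      // assumed: raw 16-bit PCM from the microphone
      sampleRateHertz: 16000,    // assumed sample rate
      languageCode: primaryLanguage,
      alternativeLanguageCodes: alternativeLanguages,
      model: 'command_and_search',
    },
    interimResults: false,
  };
}

const streamingRequest = buildStreamingRequest('en-US', ['hi-IN']);
console.log(JSON.stringify(streamingRequest, null, 2));
```

The resulting object only has an effect when the client is created from require('@google-cloud/speech').v1p1beta1, as mentioned in the comment above.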

1 Answer

0
votes

The code below runs for me, although it's true that it doesn't always detect the right language. Take into account that this feature is still in Beta. Also note that the official docs state:

... feature is ideal ... to transcribe short statements like voice commands or search.

With the particular audio used in my code (which is in English and says "how old is the Brooklyn Bridge"), running it several times, it sometimes returned the right transcription and sometimes "How old is a bre kod braća". This behavior may vary depending on the languages provided, the audio sample, and so on.

// alternativeLanguageCodes is only available in the v1p1beta1 client.
const speech = require('@google-cloud/speech').v1p1beta1;

const client = new speech.SpeechClient();

// Primary language, plus the alternatives the API may pick instead.
const languageCode = 'sr-SR';
const alternativeLanguageCodes = ['es-ES', 'en-US'];
const model = 'command_and_search';
const config = {
  alternativeLanguageCodes: alternativeLanguageCodes,
  model: model,
  languageCode: languageCode,
};
// Sample audio (spoken in English).
const uri = 'gs://cloud-samples-tests/speech/brooklyn.flac';
const audio = {
  uri: uri,
};
const request = {
  config: config,
  audio: audio,
};

// Send the request and print the top transcript of each result.
client
  .recognize(request)
  .then(data => {
    const response = data[0];
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    console.log('Transcription: ', transcription);
  })
  .catch(err => {
    console.error('Error:', err);
  });
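
To see which language the API actually picked, it may help to print the language tag reported with each result next to its transcript. The sketch below assumes that each entry in response.results carries a languageCode field, as the v1p1beta1 reference describes for SpeechRecognitionResult; summarizeResults and mockResponse are hypothetical names, and the mock only imitates the shape of a real response so the snippet can run without credentials.

```javascript
// Hypothetical helper: pairs each result's reported language with its top transcript.
function summarizeResults(response) {
  return (response.results || [])
    // Skip results that carry no transcription alternatives.
    .filter(result => result.alternatives && result.alternatives.length > 0)
    .map(result => ({
      // Assumption: v1p1beta1 results expose the detected language here.
      languageCode: result.languageCode || '(not reported)',
      transcript: result.alternatives[0].transcript,
    }));
}

// Mock response for illustration only; real values come from the API.
const mockResponse = {
  results: [
    {
      languageCode: 'en-us',
      alternatives: [{ transcript: 'how old is the Brooklyn Bridge' }],
    },
    { alternatives: [] },
  ],
};

for (const item of summarizeResults(mockResponse)) {
  console.log(`[${item.languageCode}] ${item.transcript}`);
}
```

With the real client, the same helper could be called inside the then callback, for example summarizeResults(data[0]). If the reported tag always matches languageCode, the alternatives are likely being ignored for that request.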