For the Google Cloud Speech API, is there a way to pass a word or a complete sentence and find out whether that word or sentence is the same as what's in the audio file with a confidence level score? For example, can I pass an audio file and the word "cheese" and get a response that tells me whether the audio file says "cheese" and what the API's confidence level is that it says cheese? (Same idea for passing a sentence and an audio file.)

I know I can pass helpful words or phrases, but as I understand it, those just help Google determine what the transcription should be; they don't tell me whether the audio matches the text passed (I don't think).

If Google doesn't do this, are there any other speech APIs that do?

Thanks!

1 Answer

That's not really a base functionality but more of an applied use case. In other words, there's no built-in way to do it, but you can easily build the functionality yourself on top of the API's output.

There is a confidence level in the results of Google Cloud Speech API's recognize function. You can simply compare your target word with the recognized word(s) and then, if there's a match, use the confidence level returned with that recognized transcript.

If the API thinks there's a 92% chance the word is "Cheese", it will return "Cheese" with a confidence of 0.92. Your code will then find the match and report a 92% chance that the word spoken in the file was "Cheese".
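Here is a minimal sketch of that comparison step in Python. It assumes you have already called the API and pulled each recognized transcript and its confidence out of the response into (transcript, confidence) pairs. The names `normalize` and `match_confidence` are made up for this example and are not part of any Google library.

```python
import re

def normalize(text):
    """Lowercase and drop punctuation so "Cheese." compares equal to "cheese"."""
    return re.sub(r"[^a-z0-9' ]+", " ", text.lower()).split()

def match_confidence(target, alternatives):
    """Return the highest confidence among transcripts containing `target`.

    `alternatives` is a list of (transcript, confidence) pairs that you have
    extracted from the recognition response (an assumption of this sketch).
    Returns None when no transcript contains the target word or phrase.
    """
    target_words = normalize(target)
    if not target_words:
        return None
    n = len(target_words)
    best = None
    for transcript, confidence in alternatives:
        words = normalize(transcript)
        # Look for the target as a contiguous run of words in the transcript.
        found = any(words[i:i + n] == target_words
                    for i in range(len(words) - n + 1))
        if found and (best is None or confidence > best):
            best = confidence
    return best

# Example with hand-written values standing in for an API response.
alternatives = [("Cheese", 0.92), ("Chess", 0.41)]
print(match_confidence("cheese", alternatives))
```

The same function works for a full sentence, since it matches the target as a sequence of words. Keep in mind that a `None` result only means the recognizer did not transcribe the target; it is not a confidence that the audio says something else.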