2
votes

I'm looking at using Watson's Speech to Text software to help drive voice commands for our product.

All the examples I've seen require the user to press a button before giving a command. Instead of a button press, I'd like a "wake word" or keyword to signal the beginning of a command to our product. I don't want to stream sound continuously to Watson's Speech to Text; I want the wake word to start the audio stream, and then have Watson's Speech to Text return the text of the command it heard.

For example, "OK, Google" starts sending sound to Google for speech to text.

Does IBM provide a way to create my own "OK, Google" keyword without having to send everything my application may hear to Watson's Speech to Text?

2
Stephen, where is this app going to run? Is it on some dedicated embedded product, or is this running on some existing platform (like a phone, tablet, or something similar)? - Daniel Toczala
Existing platform. Likely either a tablet (as a wall mounted kiosk) or on a general purpose Windows PC (as part of our Java application). - Stephen M -on strike-

2 Answers

2
votes

Right now the Watson Speech to Text service does not support a separate "wake word" detection module. Our current customers handle this with an edge device or a local service, such as Snowboy (https://snowboy.kitt.ai/) or something similar, which detects the wake word before any audio is sent to Watson.
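Since the question mentions a Java application, here is a minimal sketch of the gating idea described above: audio stays on the device until a local detector reports the wake word, and only the chunks that follow are forwarded. Everything in it is an assumption made for illustration. `WakeWordGate`, `onLocalHypothesis`, and `onAudioChunk` are hypothetical names, the `Consumer<byte[]>` stands in for whatever call streams audio to Watson, and closing the gate after a fixed number of chunks is a simplification (a real application would more likely stop on silence or a timeout).

```java
import java.util.Locale;
import java.util.function.Consumer;

/** Hypothetical gate: drops audio until a local detector reports the wake word. */
class WakeWordGate {
    private final String wakeWord;
    private final int chunksPerCommand;       // assumed fixed command window
    private final Consumer<byte[]> cloudSink; // stand-in for the Watson streaming call
    private int remaining = 0;                // chunks still to forward; 0 means closed

    WakeWordGate(String wakeWord, int chunksPerCommand, Consumer<byte[]> cloudSink) {
        this.wakeWord = wakeWord.toLowerCase(Locale.ROOT);
        this.chunksPerCommand = chunksPerCommand;
        this.cloudSink = cloudSink;
    }

    /** Call with the text the local (on-device) wake-word engine hypothesizes. */
    void onLocalHypothesis(String text) {
        if (text != null && text.toLowerCase(Locale.ROOT).contains(wakeWord)) {
            remaining = chunksPerCommand;     // open the gate for one command
        }
    }

    /** Call for every captured chunk; returns true only if it was forwarded. */
    boolean onAudioChunk(byte[] chunk) {
        if (remaining == 0) {
            return false;                     // gate closed: audio never leaves the device
        }
        remaining--;
        cloudSink.accept(chunk);
        return true;
    }

    boolean isOpen() {
        return remaining > 0;
    }

    public static void main(String[] args) {
        int[] forwarded = {0};
        WakeWordGate gate = new WakeWordGate("cooper", 3, chunk -> forwarded[0]++);
        gate.onAudioChunk(new byte[320]);     // dropped: wake word not heard yet
        gate.onLocalHypothesis("Cooper");     // local detector fires
        for (int i = 0; i < 5; i++) {
            gate.onAudioChunk(new byte[320]); // forwarded only while the gate is open
        }
        System.out.println("chunks forwarded: " + forwarded[0]);
    }
}
```

The local detector (Snowboy or similar) would drive `onLocalHypothesis`, and the audio capture loop would call `onAudioChunk`, so the cloud service only ever sees audio that follows the wake word.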

0
votes

I'm not sure whether Watson supports a wake-up word. If you plan to integrate voice into an application that runs on a PC, tablet, or phone, you can implement the wake-up word yourself on the device, using either the Microsoft speech recognition engine or CMU Sphinx.

Here is sample code using the Microsoft speech engine:

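using System;
using System.Speech.Recognition;   // requires a reference to System.Speech (Windows)

SpeechRecognitionEngine sr = new SpeechRecognitionEngine();
sr.SetInputToDefaultAudioDevice();

// Create a grammar that recognizes the wake-up word, e.g. your app name
Choices wakeWord = new Choices();
wakeWord.Add("Cooper");

GrammarBuilder gb = new GrammarBuilder();
gb.Append(wakeWord);
Grammar g = new Grammar(gb);

// Load synchronously so the grammar is ready before recognition starts
sr.LoadGrammar(g);
sr.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sr_SpeechRecognized);

// Start continuous recognition; without this call no events are raised
sr.RecognizeAsync(RecognizeMode.Multiple);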

Your application is activated once the wake-up word "Cooper" is uttered. In the event handler you can then capture or record the audio that follows and send it to Watson.