1
votes

I'm trying to use the .NET SpeechRecognitionEngine with C# in Visual Studio Express. However, I'm finding that it picks up completely wrong words and sentences and assumes they are something in the grammar.

For example, if I load "test 1" into the grammar and say "filthy beast", which is not even close to "test 1", the SpeechRecognized event handler fires. I left a movie playing on Netflix while coding, and the recognized event was firing on the music and dialogue in the movie, so it's way off.

Is there a way to prevent it from assuming the spoken words are in the grammar? Or any way to stop this?

Any tips?

Here is a log output for me saying "filthy beast" when the grammar only has "test 1" loaded into it.

speechDetectedHandler():
speechHypothesizedHandler():  confidence = 0.002903746    e.Result.Text = Test
speechHypothesizedHandler():  confidence = 0.8096436    e.Result.Text = Test
speechRecognizedHandler():  confidence = 0.7723699    e.Result.Text = Test 1

Code:

using System.Speech.Recognition;

public SpeechRecognitionEngine sre;

// Pick the installed recognizer that matches the requested culture.
String culture = "en-US";
foreach (RecognizerInfo config in SpeechRecognitionEngine.InstalledRecognizers())
{
    if (config.Culture.ToString() == culture)
    {
      sre = new SpeechRecognitionEngine(config);
      break;
    }
}
sre.SetInputToDefaultAudioDevice();

sre.MaxAlternates = 0;

// Wire up the handlers that produce the log output shown above.
sre.AudioLevelUpdated += new EventHandler<AudioLevelUpdatedEventArgs>(audioLevelHandler);
sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(speechRecognizedHandler);

sre.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(speechHypothesizedHandler);
sre.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(speechDetectedHandler);

// speechCommands holds the phrases to recognize, e.g. "test 1".
GrammarBuilder gb = new GrammarBuilder(speechCommands);
Grammar g = new Grammar(gb);

// Replace any previously loaded grammars with the new one, then start listening.
sre.UnloadAllGrammars();
sre.LoadGrammar(g);
startListening();

2 Answers

1
votes

The solution is to create and load a grammar containing entries similar to the word or phrase you want to use; this increases accuracy. Then evaluate the confidence levels and result text of the first and second hypothesized events and of the recognized event. This is not very practical, as the values will be different for each user.
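As a rough sketch of the "evaluate confidence and result text" step, a small check like the one below could be called from the SpeechRecognized handler. RecognitionFilter and ShouldAccept are hypothetical names, not part of System.Speech, and the threshold is an assumption that would need tuning per user and microphone.

```csharp
using System;

// Hypothetical helper, not part of System.Speech.
public static class RecognitionFilter
{
    // Accepts a result only when the text is the expected phrase
    // (ignoring case) and the confidence reaches minConfidence.
    public static bool ShouldAccept(string resultText, float confidence,
                                    string expectedPhrase, float minConfidence)
    {
        bool textMatches = string.Equals(resultText, expectedPhrase,
                                         StringComparison.OrdinalIgnoreCase);
        return textMatches && confidence >= minConfidence;
    }
}
```

In the handler this might be used as `if (!RecognitionFilter.ShouldAccept(e.Result.Text, e.Result.Confidence, "test 1", 0.9f)) return;`. Note that the false match in the question's log scored 0.7723699, so any threshold at or below that value would still let it through.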

There is no way to prevent the .NET Speech Recognition Engine from ALWAYS RETURNING A GRAMMAR MATCH. You could say "bob" in a silent room into a studio-grade mic and it would still recognize "open windows media player".

Warning 1: grammar word lists of over 1,000 entries slow things down and can lock up the application.

Warning 2: en-US has good English recognition capabilities; switching to en-GB etc. lowers accuracy drastically.

So far I have had better results with Google's Speech Recognition API. It does require you to be online, but it is 10x more accurate and you can easily test for a match yourself.

-1
votes

You can use a wildcard grammar element to accept other words without forcing a match against the elements in the grammar. You can add the wildcard to the Choices together with your commands.
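A minimal sketch of that idea, assuming a reference to the System.Speech assembly on Windows. WildcardGrammar and Build are hypothetical names used only for illustration; the Choices, GrammarBuilder.AppendWildcard and Grammar calls are the System.Speech.Recognition API.

```csharp
using System.Speech.Recognition;

// Hypothetical helper for illustration only.
public static class WildcardGrammar
{
    // Builds a grammar whose alternatives are the given commands
    // plus one wildcard alternative that can absorb other speech.
    public static Grammar Build(params string[] commands)
    {
        var alternatives = new Choices(commands);

        // A GrammarBuilder containing only a wildcard element.
        var anythingElse = new GrammarBuilder();
        anythingElse.AppendWildcard();
        alternatives.Add(anythingElse);

        return new Grammar(new GrammarBuilder(alternatives));
    }
}
```

The result could then be loaded with `sre.LoadGrammar(WildcardGrammar.Build("test 1"));`. The SpeechRecognized handler still has to inspect e.Result.Text to tell a real command from a wildcard match.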

If you want to recognize commands in the presence of other speech, this solution might not be easy to tune. A specialized keyword-spotting solution that listens for a key phrase like "ok google" might make more sense then. The Microsoft speech engine has no API for that, but there are external libraries.