Speech Recognition Engine Breaking Down a Command

Question

I am writing a WPF application that needs to recognize spoken commands from the user. I am new to the Speech Recognition Engine and unsure of the best way to structure this. The intended flow of the application is as follows:

1. The user speaks a keyword to 'awaken' the app (e.g., Amazon Echo requires the user to say 'Alexa').
2. The user speaks the command for the app to perform (e.g., "Play 'some song by some artist'").

My issue is that I am unsure what my program should do after the keyword is recognized. If I want to play the song the user names after speaking the keyword, do I need to start a new speech recognizer? This is some pseudocode of what I am doing:

    // Requires: using System.Speech.Recognition; (reference the System.Speech assembly)
    private SpeechRecognitionEngine _listen;

    public frmHome()
    {
        InitializeComponent();
        SetupListen();
    }

    private void SetupListen()
    {
        ResetListener();
    }

    private void ResetListener()
    {
        _listen = new SpeechRecognitionEngine();

        Choices exChoices = new Choices();

        exChoices.Add(new String[] { "keyword" });

        GrammarBuilder gb = new GrammarBuilder();
        gb.Append(exChoices);

        Grammar g = new Grammar(gb);

        _listen.LoadGrammar(g);
        _listen.SpeechRecognized += sr_speechRecognized;
        _listen.SetInputToDefaultAudioDevice();
        // RecognizeAsync() with no arguments performs a single recognition, then stops.
        _listen.RecognizeAsync();
    }

    private void sr_speechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result.Text.Equals("keyword"))
        {
            //start listening for the command
        }

        ResetListener();
    }

Priyank Priyank · Accepted Answer · 2017-07-27T06:15:32

You can have a separate grammar for your start/stop keywords, and have a flag variable which is set to true when the start command is spoken.

Then in your SpeechRecognized handler, you can check the flag and then go on to search the recognized text for the command text.
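
As a rough sketch of the flag idea above, and not a definitive implementation: one engine can load two named grammars and keep running with `RecognizeMode.Multiple`, so no new recognizer is created after the keyword. The class names `CommandGate` and `VoiceListener`, the grammar names "wake" and "command", the `CommandHeard` event, and the "play" plus dictation phrase shape are all assumptions made for illustration. The project needs a reference to the System.Speech assembly.

```csharp
using System;
using System.Speech.Recognition; // needs a reference to the System.Speech assembly

// Hypothetical helper: holds the "awake" flag and decides what to act on.
public sealed class CommandGate
{
    private bool _awake;

    // Returns the command text to act on, or null when nothing should run.
    public string Accept(string grammarName, string text)
    {
        if (grammarName == "wake")
        {
            _awake = true;          // wake word heard: arm the gate
            return null;
        }
        if (!_awake) return null;   // command spoken without the wake word
        _awake = false;             // allow one command per wake word
        return text;
    }
}

// Hypothetical wrapper: one engine, two named grammars, continuous recognition.
public sealed class VoiceListener : IDisposable
{
    private readonly SpeechRecognitionEngine _engine = new SpeechRecognitionEngine();
    private readonly CommandGate _gate = new CommandGate();

    public event Action<string> CommandHeard;

    public VoiceListener(string wakeWord)
    {
        var wake = new Grammar(new GrammarBuilder(wakeWord)) { Name = "wake" };

        // "play" followed by free dictation; this phrase shape is an assumption.
        var commandBuilder = new GrammarBuilder("play");
        commandBuilder.AppendDictation();
        var command = new Grammar(commandBuilder) { Name = "command" };

        _engine.LoadGrammar(wake);
        _engine.LoadGrammar(command);
        _engine.SpeechRecognized += OnSpeechRecognized;
        _engine.SetInputToDefaultAudioDevice();
        _engine.RecognizeAsync(RecognizeMode.Multiple); // keep listening after each result
    }

    private void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // The result reports which grammar matched, so the gate can tell
        // the wake word apart from a command.
        string command = _gate.Accept(e.Result.Grammar.Name, e.Result.Text);
        if (command != null)
        {
            var handler = CommandHeard;
            if (handler != null) handler(command);
        }
    }

    public void Dispose()
    {
        _engine.Dispose();
    }
}
```

A form could create `new VoiceListener("keyword")` once, subscribe to `CommandHeard`, and dispose the listener when the window closes. Keeping the flag in a separate class means the wake/command decision can be exercised without a microphone.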

If you are looking for a short keyword, such as "Alexa", you could simply search the recognized text for the keyword before handling the uttered command, using an ordinary string search.
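
The keyword search can be sketched with plain string operations, assuming the recognizer returns the whole utterance (keyword plus command) as one string. `WakeWordParser` and `TryExtractCommand` are hypothetical names, not part of System.Speech.

```csharp
using System;

public static class WakeWordParser
{
    // Returns true and the text following the wake word when it is present.
    // Note: IndexOf matches substrings, so "Alexa" would also match inside
    // "Alexander"; a stricter word-boundary check may be needed in practice.
    public static bool TryExtractCommand(string recognizedText, string wakeWord, out string command)
    {
        command = null;
        if (string.IsNullOrWhiteSpace(recognizedText) || string.IsNullOrEmpty(wakeWord))
        {
            return false;
        }

        int index = recognizedText.IndexOf(wakeWord, StringComparison.OrdinalIgnoreCase);
        if (index < 0)
        {
            return false; // wake word not spoken
        }

        // Keep what follows the wake word, minus surrounding spaces and punctuation.
        command = recognizedText.Substring(index + wakeWord.Length).Trim(' ', ',', '.');
        return command.Length > 0;
    }
}
```

Inside the SpeechRecognized handler this might be called as `WakeWordParser.TryExtractCommand(e.Result.Text, "Alexa", out command)`, acting on `command` only when the call returns true.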

I hope this article helps: https://msdn.microsoft.com/en-us/magazine/dn857362.aspx
