How to use SAPI Speech Recognition C# to control a game

Question

I've been playing Quake Live (quakelive.com) and have been getting frustrated with my keyboard bindings, so I'd like to trigger binds with voice commands instead.

I thought I'd create a C# console app that runs in the background and uses the built-in SAPI speech recognition engine on Windows 7 64-bit to do the heavy lifting. My program would listen for SpeechRecognized events and respond accordingly. However, I'm not sure how to run my console app in the background, in conjunction with MS speech recognition, while I'm playing the game.

This is what I have written so far:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Speech.Recognition;
using System.Text;
using System.Threading.Tasks;
using AutoItX3Lib;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            AutoItX3 autoit = new AutoItX3();

            // Create a default dictation grammar.
            DictationGrammar defaultDictationGrammar = new DictationGrammar();
            defaultDictationGrammar.Name = "default dictation";
            defaultDictationGrammar.Enabled = true;

            // Create our process
            autoit.Run("notepad.exe", "", autoit.SW_MAXIMIZE);
            autoit.WinWaitActive("Unbenannt - Editor");
            Console.WriteLine("its active");

            SpeechRecognizer sr = new SpeechRecognizer();
            sr.SpeechRecognized += (s, e) =>
            {
                foreach (RecognizedWordUnit word in e.Result.Words)
                {
                    Console.WriteLine(word.Text);
                    if (word.Text.Trim().ToLower() == "one")
                        autoit.Send(word.Text.ToLower() + "{LCTRL}+{LSHIFT}+a", 0);
                    else
                        autoit.Send(word.Text.ToLower() + " ", 0);
                }
            };
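            sr.LoadGrammar(defaultDictationGrammar);

            // Keep the process alive; otherwise Main returns before any speech is recognized.
            Console.ReadLine();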
        }
    }
}

Basically, I'd like MS speech recognition to be running while my game is running, and for my console app to listen for specific words I say. As you can see in my example code, I listen for the word "one" and, when I hear it, use AutoIt to send the text to Notepad along with some control characters that should select all the text once it has been written.

So far it's not working. It seems my console app has to have focus (be the foreground app), and even then, when I say words like "one" or "two", MS speech recognition tries to do "console command" stuff with my app rather than just pass dictation text to it. For example, when I say the word "one" it keeps saying "moving", I think because it assumes a console window is not a document, so whatever I say must be a command and not dictation.

Can anyone see what I am doing wrong and how to get this working as I want?

The final solution would be to send the control characters to the running "chrome.exe" process rather than to Notepad, because Quake Live runs in the browser. I presume that sending keystrokes via AutoIt would be enough for the Chrome process to pass them on to the Quake Live plugin as game keyboard input.

Any help or advice is appreciated.

Michael Levy · Accepted Answer · 2012-10-12T15:09:46

When you create a SpeechRecognizer, you are creating a shared recognizer that uses the Windows desktop speech recognition. When you say '(it) tries to do "console command" stuff', I suspect this is because you are using the shared recognizer, which is intended for controlling applications from the desktop. If you want speech recognition dedicated to your application, create a SpeechRecognitionEngine instead. The shared recognizer may work for what you want, but I think you'll need a dedicated grammar for it to properly control your application.
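
As a rough illustration of that switch, here is a minimal sketch of an in-process recognizer. It still uses the dictation grammar from the question, and it only prints what it hears. The class name is made up for the example, and I have not tested it against Quake Live, so treat it as a starting point rather than a finished solution.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical class name, used only for this sketch.
class InProcDictationDemo
{
    static void Main()
    {
        // SpeechRecognitionEngine runs in-process, so it does not go through
        // the shared Windows desktop recognizer that SpeechRecognizer uses.
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());

            // Unlike the shared recognizer, the in-process engine has to be
            // told explicitly where its audio comes from.
            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (s, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);

            // RecognizeMode.Multiple keeps listening after each recognized phrase.
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

Recognition happens on the engine's own thread, so the Console.ReadLine call is there only to keep the process alive.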

Since you are offering your user a limited set of voice commands, you'll have better success if you provide a grammar that supports this vocabulary rather than using the dictation grammar.
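
A command grammar along those lines could be sketched as follows. The phrase list, the key strings (written in AutoIt's Send syntax, where "^a" means Ctrl+A) and the 0.6 confidence cutoff are all assumptions for illustration. The sketch only prints the keys it would send; the autoit.Send call from the question would go where the comment indicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Speech.Recognition;

// Hypothetical class name, used only for this sketch.
class VoiceCommandDemo
{
    // Assumed phrase-to-keystroke table; replace with your own binds.
    static readonly Dictionary<string, string> Commands =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "one", "1" },
            { "two", "2" },
            { "select all", "^a" },
        };

    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Restrict recognition to the known phrases instead of free dictation.
            var builder = new GrammarBuilder(new Choices(Commands.Keys.ToArray()));
            builder.Culture = engine.RecognizerInfo.Culture;
            engine.LoadGrammar(new Grammar(builder) { Name = "game commands" });

            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (s, e) =>
            {
                // Illustrative threshold for ignoring weak matches.
                if (e.Result.Confidence < 0.6f)
                    return;

                string keys;
                if (Commands.TryGetValue(e.Result.Text, out keys))
                {
                    // Here you would call autoit.Send(keys, 0) as in the question.
                    Console.WriteLine("{0} -> {1}", e.Result.Text, keys);
                }
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening for commands. Press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

With a small fixed vocabulary the engine only has to choose among a few phrases, which should make it less likely to mishear "one" as some unrelated word.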

I don't know if the console app requires being in the foreground to capture the sound card. I suspect once you change to the inproc recognizer, the app will continue to function even while in the background.

For more background, see http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. It is probably the best introductory article I've found so far. It is a little out of date, but very helpful. (The AppendResultKeyValue method was dropped after the beta.) http://msdn.microsoft.com/en-us/library/hh361625.aspx is also a good place to start.
