9
votes

In OS X Mavericks, speech dictation is now built in, and it is very useful. I am trying to use the dictation capability to create my own digital life assistant, but I can't find a way to use the recognition functionality to receive the recognized speech in my own application rather than in a text box.

I have looked into NSSpeechRecognizer, but that seems to be geared toward programming speakable commands with a pre-defined grammar rather than dictation. It doesn't matter what programming language I use, but Python or Java would be nice...

Thanks for your help!

1
Any solutions since? – Nicolas Manzini
@NicolasManzini Yes, see my answer. – Franck Dernoncourt

1 Answer

4
votes

You can use SFSpeechRecognizer (mirror), which requires macOS 10.15+ and is made for speech recognition. From its documentation:

Perform speech recognition on live or prerecorded audio, receive transcriptions, alternative interpretations, and confidence levels of the results.
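As a rough illustration of that API, here is a minimal Swift sketch that transcribes a prerecorded audio file. The class name `FileTranscriber`, the error type `TranscriptionError`, and the default `en-US` locale are hypothetical choices made for this example, not part of the Speech framework. It also assumes the code runs inside an app whose Info.plist contains an `NSSpeechRecognitionUsageDescription` entry, since the authorization prompt needs one.

```swift
import Foundation
import Speech

// Hypothetical error type for this sketch.
enum TranscriptionError: Error {
    case unavailable  // not authorized, or no recognizer for the locale
}

// Hypothetical helper that wraps SFSpeechRecognizer for file transcription.
final class FileTranscriber {
    private let recognizer: SFSpeechRecognizer?
    // Keep a reference so the task is not released while it is running.
    private var task: SFSpeechRecognitionTask?

    init(localeIdentifier: String = "en-US") {
        recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier))
    }

    func transcribe(_ url: URL,
                    completion: @escaping (Result<String, Error>) -> Void) {
        // Ask the user for permission before starting any recognition.
        SFSpeechRecognizer.requestAuthorization { [weak self] status in
            guard let self = self else { return }
            guard status == .authorized,
                  let recognizer = self.recognizer,
                  recognizer.isAvailable else {
                completion(.failure(TranscriptionError.unavailable))
                return
            }

            let request = SFSpeechURLRecognitionRequest(url: url)
            // Only report the final transcription, not intermediate guesses.
            request.shouldReportPartialResults = false

            self.task = recognizer.recognitionTask(with: request) { result, error in
                if let error = error {
                    completion(.failure(error))
                } else if let result = result, result.isFinal {
                    completion(.success(result.bestTranscription.formattedString))
                }
            }
        }
    }
}
```

For live dictation from the microphone, the same framework offers `SFSpeechAudioBufferRecognitionRequest`, which is typically fed with buffers from an `AVAudioEngine` input tap; the file-based request is used here only to keep the sketch short.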

By contrast, as you noted in the question, NSSpeechRecognizer (mirror) provides a “command and control” style of voice recognition: the command phrases must be defined before listening starts, whereas in a dictation system the recognized text is unconstrained.
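To make the "command and control" style concrete, here is a minimal Swift sketch using NSSpeechRecognizer. The class name `CommandListener` and the example phrases are hypothetical; the sketch assumes it runs inside an app with an active run loop, so that the delegate callback can be delivered.

```swift
import AppKit

// Hypothetical wrapper for this sketch: listens only for a fixed phrase list.
final class CommandListener: NSObject, NSSpeechRecognizerDelegate {
    // The initializer is failable, so this may be nil.
    private let recognizer = NSSpeechRecognizer()

    func start(commands: [String]) {
        // The full set of recognizable phrases must be supplied up front.
        recognizer?.commands = commands
        recognizer?.listensInForegroundOnly = false
        recognizer?.delegate = self
        recognizer?.startListening()
    }

    func stop() {
        recognizer?.stopListening()
    }

    // Called only when one of the predefined phrases is recognized.
    func speechRecognizer(_ sender: NSSpeechRecognizer,
                          didRecognizeCommand command: String) {
        print("Recognized command: \(command)")
    }
}

// Example usage with made-up phrases:
// let listener = CommandListener()
// listener.start(commands: ["open mail", "play music"])
```

Anything spoken outside the `commands` list is simply not reported, which is why this class does not fit free-form dictation.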

From https://developer.apple.com/videos/play/wwdc2019/256/ (mirror):

[Image from the WWDC 2019 session linked above]

Another way is to use Mac Dictation directly, but as far as I know the only way to do so is to redirect audio feeds, which isn't very neat; e.g., see http://www.showcasemarketing.com/ideablog/transcribe-mp3-audio-to-text-mac-os/ (mirror).