3
votes

Google has recently made great progress with its speech recognition software, which is used in several open source products, e.g. Chromium Web Speech and Android hands-free texting. I would like to use this speech recognition as part of my server stack, but I can't find much information about it.

Is the speech recognition software available as a library or package? Alternatively, can I call Chromium from another program to transcribe an audio file to text?

2
I think these answers may be outdated; Google started making some parts public in early 2013. – Jeroen Ooms
Got a link? It would be helpful. – Michael Levy
E.g. bgr.com/2013/01/14/google-chrome-speech-recognition-api-291569 and dvcs.w3.org/hg/speech-api/raw-file/tip/…. But this is about interfacing in Chrome; I can't find it as a standalone library. – Jeroen Ooms

2 Answers

1
votes

The Web Speech APIs are designed to be used only in the context of Chrome or Android. A lot of the work happens in the client, so there is no public server-to-server API that would simply take an audio file and process it.

If you search GitHub you will find tools such as https://gist.github.com/alotaiba/1730160, but I am fairly certain that this method of access is not supported or endorsed, and there is no guarantee it will keep working.

1
votes

The method mentioned above at https://gist.github.com/alotaiba/1730160 does work for me. I use it daily in my home automation programs. A Python script captures audio and decides whether it is useful audio or just noise, then sends the short audio snippet to Google, which returns the text, all in under a second. I have successfully integrated it into my programs, and if you search around you will find more people who have done the same.
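
For anyone curious what the "useful audio or just noise" step might look like, here is a minimal sketch of a simple energy gate. It is not the actual script described above: the function names, the 160-sample frame length, and the RMS threshold of 500 are all illustrative assumptions, and it expects the audio to already be available as a list of signed 16-bit PCM sample values. The upload step is left out on purpose, since the endpoint used by the gist is unofficial and may change.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_speech_segments(
    samples: Sequence[int],
    frame_len: int = 160,      # assumed: 10 ms at 16 kHz
    threshold: float = 500.0,  # assumed: tune for your microphone
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of frames louder than threshold."""
    segments: List[Tuple[int, int]] = []
    start = None  # start index of the loud run currently being tracked
    for offset in range(0, len(samples), frame_len):
        loud = frame_rms(samples[offset:offset + frame_len]) > threshold
        if loud and start is None:
            start = offset                     # a loud run begins
        elif not loud and start is not None:
            segments.append((start, offset))   # the loud run ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # run lasted until the end
    return segments
```

Each returned `(start, end)` pair marks a slice such as `samples[start:end]` that is loud enough to be worth sending for transcription; everything else is treated as background noise. A fixed threshold is the crudest possible approach, so in a noisy room you would probably want to calibrate it against a few seconds of silence first.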