0
votes

I am new to speech recognition, android and i have a use case where i need to build an android app which takes commands(limited set of commands, less than 100) from users and executes some logic. I have googled a bit and found the following can be done

  1. Use google cloud speech api
  2. Use Android inbuilt speech to text capability (Is it different from google cloud speech api? If so how?). Also what are the pros and cons of using offline mode of android speech to text?
  3. Use open source speech recognition libraries like Kaldi, CMU Sphinx(it looked like they need a lot of effort in collecting and training the data)

Can someone please suggest me which of the above might best suit my use case? I have a limited set of commands and speed matters the most to me.

I am really confused and thus putting this question. Thanks in advance.

1

1 Answers

1
votes

Use google cloud speech api

Very expensive since you have to pay for every request.

Use Android inbuilt speech to text capability (Is it different from google cloud speech api? If so how?). Also what are the pros and cons of using offline mode of android speech to text?

The inbuilt API is ok to use. It is different from cloud API and it is free. It does not work offline transparently for the user though. Bad side it is slow and you can not configure the vocabulary. So it will decode all words instead of some particular set of commands and often will confuse the required commands with other words in noise.

Use open source speech recognition libraries like Kaldi, CMU Sphinx(it looked like they need a lot of effort in collecting and training the data)

Proper development is always an effort.