
I'm using Sphinx-4 to convert speech to text, but I need the application to recognize a grammar and then a sequence of dictated words.

For example, having the following grammar:

public <greet> = (Good morning | Hello);

If I say "Hello" and then "Joan" (or any other name), I want it to return the text "Hello Joan".

I saw the topic "Dictation Application using Sphinx4", but if I change the settings it always returns <unk>. Is this the right approach? If so, what am I doing wrong?

There is no use for the grammar here; you can just use sphinx4 in dictation mode and parse the decoded string. For an example of an easy sphinx4 setup, see the tutorial cmusphinx.sourceforge.net/wiki/tutorialsphinx4 - Nikolay Shmyrev
I know, but I need to use a grammar for my work. - user3267555
You cannot do this because it has no meaning. You can just recognize using a dictation language model; it will give you all that you need. If you want to improve the results, you can create a specific language model for your domain, as the cmusphinx tutorial suggests. - Nikolay Shmyrev
So there is no chance of using a grammar followed by dictation? I'm asking because it is very important for me to define commands without using a parser. - user3267555
You cannot mix grammar and dictation in one utterance. - Nikolay Shmyrev
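
To illustrate the "parse the decoded string" suggestion from the comments, here is a minimal sketch in plain Java. It does not call Sphinx4 at all: the input string stands in for whatever hypothesis text the recognizer returns in dictation mode, and the class name GreetingParser and its parse method are hypothetical names chosen for this example. It assumes the greeting always comes first in the utterance.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class GreetingParser {
    // Greeting phrases taken from the grammar in the question, lowercased.
    private static final List<String> GREETINGS = List.of("good morning", "hello");

    /**
     * Splits a decoded hypothesis into {greeting, remaining words}.
     * Returns an empty Optional when the text does not start with a known greeting.
     */
    public static Optional<String[]> parse(String hypothesis) {
        // Collapse repeated whitespace so prefix matching is predictable.
        String text = hypothesis.trim().replaceAll("\\s+", " ");
        String lower = text.toLowerCase(Locale.ROOT);
        for (String greeting : GREETINGS) {
            // Require a word boundary after the greeting.
            if (lower.equals(greeting) || lower.startsWith(greeting + " ")) {
                String head = text.substring(0, greeting.length());
                String rest = text.substring(greeting.length()).trim();
                return Optional.of(new String[] {head, rest});
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "hello joan" is a stand-in for the recognizer's decoded string.
        parse("hello joan").ifPresent(p -> System.out.println(p[0] + " | " + p[1]));
    }
}
```

The greeting acts as the command and everything after it is treated as free dictation, so the set of names does not have to be known in advance.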

1 Answer


To get the output "Hello Joan", your grammar must list the names explicitly, like this:

public <greet> = (Good morning | Hello) (JOAN | JOHN | MIKE);

It can then return any of:

- Good morning JOAN
- Good morning JOHN
- Good morning MIKE
- Hello JOAN
- Hello JOHN
- Hello MIKE

If you also want this grammar to return just "Good morning" or just "Hello", then it should be

public <greet> = (Good morning | Hello) (JOAN | JOHN | MIKE)*;
Here * specifies zero or more occurrences of JOAN/JOHN/MIKE, so the grammar can also return Hello JOHN MIKE, just Hello, or just Good morning, along with all other possible combinations.
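
As a rough way to see which word sequences the two grammars above accept, the sketch below mirrors them with regular expressions in plain Java. This is only an approximation for illustration: it is not Sphinx4 and not a JSGF parser, and the class name GreetGrammarCheck and its method names are hypothetical.

```java
import java.util.regex.Pattern;

public class GreetGrammarCheck {
    // Regex stand-in for: (Good morning | Hello) (JOAN | JOHN | MIKE)
    private static final Pattern ONE_NAME = Pattern.compile(
            "(good morning|hello) (joan|john|mike)", Pattern.CASE_INSENSITIVE);

    // Regex stand-in for: (Good morning | Hello) (JOAN | JOHN | MIKE)*
    private static final Pattern ANY_NAMES = Pattern.compile(
            "(good morning|hello)( (joan|john|mike))*", Pattern.CASE_INSENSITIVE);

    /** True if the whole sentence is a greeting followed by exactly one listed name. */
    public static boolean acceptsOneName(String sentence) {
        return ONE_NAME.matcher(sentence).matches();
    }

    /** True if the whole sentence is a greeting followed by zero or more listed names. */
    public static boolean acceptsAnyNames(String sentence) {
        return ANY_NAMES.matcher(sentence).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Hello", "Hello JOHN", "Hello JOHN MIKE", "Hello Anna"};
        for (String s : samples) {
            System.out.println(s + " -> one name: " + acceptsOneName(s)
                    + ", zero or more names: " + acceptsAnyNames(s));
        }
    }
}
```

Note that a name missing from the list, such as "Anna" in the sample, is rejected by both patterns. This matches the point made in the comments: a grammar only covers the words it enumerates.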