I am trying to improve the recognition accuracy of pocketsphinx in noisy environments. However, users may run the app in varying environments, so training the acoustic model with noise is not something I want to do.
My question is: would applying noise reduction to the speech signal before feeding it to pocketsphinx necessarily reduce recognition accuracy?
If yes, what features of the speech need to be retained after noise reduction? Currently I observe that the WER goes up from ~40% (free-form language) to ~60% when I use noise reduction.
Just to add, the speech does sound better perceptually after noise reduction.
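For clarity, the WER figures above are the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. A minimal sketch of that calculation, assuming whitespace-tokenized transcripts; the function name word_error_rate is only for illustration and is not part of pocketsphinx:

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running this on the same reference transcripts with and without noise reduction gives the two numbers being compared.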
Pocketsphinx argfile:
-lm lm_giga_64k_vp_3gram.DMP
-dict lm_giga_64k_vp.sphinx.dic
-hmm voxforge_en_sphinx.cd_cont_5000
The idea here is to demonstrate an increase in speech recognition accuracy with noise reduction enabled. Intuitively, this should happen unless the noise reduction algorithm is badly distorting the spectral content of the signal.
Any help would be appreciated.