In the case of the Webchat channel, you can have a look at the sources to understand how it uses speech recognition.
In particular, you can see that the whole speech-to-text part is handled by the webchat client before the message is sent to the bot (sources):
```typescript
const startListeningEpic: Epic<ChatActions, ChatState> = (action$, store) =>
    action$.ofType('Listening_Starting')
        .do((action: ShellAction) => {
            var locale = store.getState().format.locale;
            var onIntermediateResult = (srText: string) => { store.dispatch({ type: 'Update_Input', input: srText, source: "speech" }) };
            var onFinalResult = (srText: string) => {
                srText = srText.replace(/^[.\s]+|[.\s]+$/g, "");
                onIntermediateResult(srText);
                store.dispatch({ type: 'Listening_Stop' });
                store.dispatch(sendMessage(srText, store.getState().connection.user, locale));
            };
            var onAudioStreamStart = () => { store.dispatch({ type: 'Listening_Start' }) };
            var onRecognitionFailed = () => { store.dispatch({ type: 'Listening_Stop' }) };
            Speech.SpeechRecognizer.startRecognizing(locale, onIntermediateResult, onFinalResult, onAudioStreamStart, onRecognitionFailed);
        })
        .map(_ => nullAction)
```
Here the bot is called through sendMessage(srText, ...), so it only receives the recognized text, not the audio.
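
So on the bot side, the incoming activity is an ordinary text message. As a rough sketch (assuming a Node.js bot built with the Bot Builder v3 SDK; the endpoint path and environment variable names here are just illustrative), the handler only ever sees the recognized text:

```typescript
import * as restify from 'restify';
import * as builder from 'botbuilder';

// Standard Bot Builder v3 setup: an HTTP endpoint the connector listens on.
const server = restify.createServer();
server.listen(process.env.PORT || 3978);

const connector = new builder.ChatConnector({
    appId: process.env.MICROSOFT_APP_ID,
    appPassword: process.env.MICROSOFT_APP_PASSWORD
});
server.post('/api/messages', connector.listen());

const bot = new builder.UniversalBot(connector, (session) => {
    // By the time the webchat has dispatched sendMessage(srText, ...),
    // the activity delivered here is plain text: no audio is attached.
    session.send(`You said: ${session.message.text}`);
});
```

In other words, if you need the raw audio in your bot, the webchat channel will not provide it; it performs the recognition client-side and forwards only the resulting text.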