I followed the sample application in the GitHub repository below to generate speech from text:
https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/Samples-Http/CSharp
My application runs fine. The only problem is the speaking rate: there is no break/pause after each word.
Input text: y u 7 f s d 2 3 e
This is the sample SSML I am using:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-IN"><voice xml:lang="en-IN" name="Microsoft Server Speech Text to Speech Voice (en-IN, Ravi, Apollo)">y u 7 f s d 2 3 e</voice></speak>
I want a pause after every character, because I am using this audio to read out CAPTCHA text in audio mode.
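To make the goal concrete, here is a minimal sketch of the kind of SSML I have in mind. It assumes the service honors the standard SSML `<break time="..."/>` element, which is not confirmed for this endpoint. It is written in Python only to keep it short; `build_spelled_ssml` and `pause_ms` are hypothetical names for illustration and are not part of the sample repository.

```python
from xml.sax.saxutils import escape

# Voice name copied from the SSML above.
VOICE = "Microsoft Server Speech Text to Speech Voice (en-IN, Ravi, Apollo)"


def build_spelled_ssml(text, pause_ms=500, lang="en-IN", voice=VOICE):
    """Return SSML that speaks each whitespace-separated token of `text`
    with a <break> of `pause_ms` milliseconds between tokens.

    Assumption: the target service supports the W3C SSML <break> element.
    """
    tokens = text.split()
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape each token so characters such as & or < stay valid XML.
    body = pause.join(escape(token) for token in tokens)
    return (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="{lang}">'
        f'<voice xml:lang="{lang}" name="{voice}">{body}</voice></speak>'
    )


ssml = build_spelled_ssml("y u 7 f s d 2 3 e")
print(ssml)
```

The resulting string would be sent as the request body in place of the SSML above. The pause length is a guess and would need tuning by ear.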
Please suggest a correct approach.
P.S. I don't want to repeat the whole code here by copy-pasting it; I am using the sample from the GitHub repository above.
I have also followed the conversation in the comments at the link below, with no luck:
https://docs.microsoft.com/en-us/azure/cognitive-services/speech/home