Speech-to-text large audio files [Microsoft Speech API]

Question

What is the best way to transcribe medium to large audio files, roughly 6-10 minutes each, using the Microsoft Speech API? Is there something like batch transcription for audio files?

I have used the code provided in https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-to-text-sample for continuously transcribing speech, but it stops transcribing at some point. Is there any restriction on the transcription? I am only using the free trial account at the moment.

Btw, I assume there is no difference between Bing Speech API and the new Speech service API, right?

Thanks everyone!

Could you share your code @Blue482? I would like to see it :-) — Beckenbaur93

wolfma · Accepted Answer · 2018-06-19T18:05:45

Thank you for your feedback.

I agree that the sample (and the documentation you are looking at) is not very clear. We will update it soon.

The sample uses RecognizeAsync, which really should be called RecognizeOnceAsync: it currently just returns the FIRST FinalResult from the service, which is why the transcription appears to stop. To transcribe a whole file, use Start/StopRecognizeAsync instead and register to receive Result events.
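To make the start/stop-plus-events pattern concrete, here is a minimal sketch in Python. It assumes the Python Speech SDK package (azure-cognitiveservices-speech), where the counterparts of the calls above are, as far as I know, start_continuous_recognition / stop_continuous_recognition and the recognized event; those names differ from the ones in the sample under discussion. The helper names make_recognizer and transcribe_continuously are made up for illustration, and the key, region, and file path are placeholders.

```python
import threading


def make_recognizer(key, region, wav_path):
    """Build a file-backed recognizer (hypothetical helper).

    Assumes the azure-cognitiveservices-speech package is installed;
    it is imported lazily so the rest of the sketch loads without it.
    """
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    return speechsdk.SpeechRecognizer(speech_config=speech_config,
                                      audio_config=audio_config)


def transcribe_continuously(recognizer, timeout=None):
    """Collect every final result until the session stops or is canceled."""
    phrases = []
    done = threading.Event()

    def on_recognized(evt):
        # Called once per finalized phrase, not only for the first one.
        if evt.result.text:
            phrases.append(evt.result.text)

    def on_finished(evt):
        # End of file, an error, or a stopped session all end the wait.
        done.set()

    recognizer.recognized.connect(on_recognized)
    recognizer.session_stopped.connect(on_finished)
    recognizer.canceled.connect(on_finished)

    recognizer.start_continuous_recognition()
    done.wait(timeout)
    recognizer.stop_continuous_recognition()
    return " ".join(phrases)
```

Usage would look like transcribe_continuously(make_recognizer("<key>", "<region>", "meeting.wav")). The point of the design is that the result handler keeps appending phrases until the service signals the end of the audio, instead of returning after the first final result.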

Again, sorry for the bad documentation here. We will update it soon, and we will probably also rename the API in a refresh.

If you already have the audio as files, you could also use the batch transcription feature. Perhaps that helps? https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription
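For a rough idea of what submitting a batch job could look like, here is a sketch that only builds the HTTP request and does not send it. It assumes the later v3.1 shape of the Speech to text REST API (a POST to /speechtotext/v3.1/transcriptions with contentUrls, locale, and displayName in the JSON body), which may differ from what the linked page described at the time, so please check the current reference before relying on it. The function name build_batch_request and all argument values are hypothetical.

```python
import json
import urllib.request


def build_batch_request(key, region, audio_urls, locale="en-US",
                        display_name="batch job"):
    """Build (but do not send) a request that creates a transcription job.

    Endpoint path and body fields are assumptions based on the v3.1
    REST API; audio_urls must be URLs the service itself can read.
    """
    url = (f"https://{region}.api.cognitive.microsoft.com"
           "/speechtotext/v3.1/transcriptions")
    body = {
        "contentUrls": list(audio_urls),
        "locale": locale,
        "displayName": display_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the job. The service then processes the files asynchronously, so the results have to be fetched later rather than read from this response.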

Cheers, Wolfgang
