7
votes

I want to transcribe longer audio files (at least 5 minutes) using REST APIs from Microsoft. There are a lot of different products and names, e.g. Speech service API or Bing Speech API. None of the REST APIs I have tried so far supports transcribing longer audio files.

The documentation states there is a REST API exactly for this case: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription

What is the endpoint for this service?


1 Answer

8
votes

There is a sample available on GitHub here: https://github.com/PanosPeriorellis/Speech_Service-BatchTranscriptionAPI

The endpoint is CRIS's endpoint, as in this code:

private const string HostName = "cris.ai";
// ...
var client = CrisClient.CreateApiV2Client(SubscriptionKey, HostName, Port);

I then found in the documentation that the API is exposed through Swagger (the link is given in the documentation), which makes it easier to explore the available methods (switch from 2.0beta to 2.0 at the top).

So to create a new transcription, the path is: /api/speechtotext/v2.0/transcriptions, called with the POST method, so the full endpoint is:

Please note that the subscription key used for transcription must be on the Standard (S0) pricing tier, not the Free one.
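
To show how the host from the sample and the transcriptions path fit together, here is a minimal sketch that only builds the POST request and prints it, without sending anything. The HTTPS scheme, port 443, and the Ocp-Apim-Subscription-Key header name are my assumptions, not taken from the sample; the class name and the key placeholder are hypothetical. The request body is left out on purpose, since its schema should be read from the Swagger definition.

```csharp
using System;
using System.Net.Http;

class TranscriptionEndpointSketch
{
    // Host name as used in the GitHub sample.
    private const string HostName = "cris.ai";
    // Assumption: HTTPS on its default port.
    private const int Port = 443;
    // Path for creating a transcription (v2.0).
    private const string TranscriptionsPath = "/api/speechtotext/v2.0/transcriptions";

    static void Main()
    {
        // Combine scheme, host, port, and path into one URI.
        Uri uri = new UriBuilder(Uri.UriSchemeHttps, HostName, Port, TranscriptionsPath).Uri;

        using (var request = new HttpRequestMessage(HttpMethod.Post, uri))
        {
            // Assumption: the key is passed in this header; check the Swagger page.
            // The value is a placeholder for a Standard (S0) subscription key.
            request.Headers.Add("Ocp-Apim-Subscription-Key", "<your-S0-subscription-key>");

            // The JSON body is intentionally not set here; see the Swagger definition.
            Console.WriteLine(request.Method + " " + request.RequestUri);
        }
    }
}
```

To actually create a transcription you would still need to attach the JSON body described in the Swagger definition and send the request with an HttpClient.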