
I'm trying to use the Azure Cognitive Services speech-to-text service. Because recognition quality for Polish is quite poor, I tried to upload audio together with a transcript. I have tried every format, but I always get the same error: "ID20190102_083350_43_2_78120.wav: Error: normalized text is empty."

The transcript file contains only this line:

ID20190102_083350_43_2_78120.wav dzien

I created the file by hand in Notepad++ with the encoding set to UTF-8-BOM.

Thanks in advance,


1 Answer


This was caused by a bug on our side; sorry for the inconvenience. It has now been fixed in many regions, including West Europe (assuming the Polish language was selected). Please try again.

The entry should look like this, with a single tab character (shown here as <tab>) between the file name and the transcript:

ID20190102_083350_43_2_78120.wav<tab>dzien
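
If typing the tab by hand in an editor is error-prone, the file can also be generated with a small script. The following is only a minimal sketch: the function name write_transcript and the output file name trans.txt are made up for illustration, and writing a UTF-8 BOM simply mirrors the encoding mentioned in the question rather than a confirmed service requirement.

```python
def write_transcript(path, entries):
    """Write one 'file name<TAB>transcript' line per entry.

    entries is a list of (wav_name, text) pairs. The 'utf-8-sig' codec
    writes a UTF-8 BOM at the start of the file (an assumption taken
    from the question, not a documented requirement).
    """
    lines = []
    for wav_name, text in entries:
        text = text.strip()
        if not text:
            # Catch empty transcripts locally before uploading.
            raise ValueError(f"{wav_name}: transcript is empty")
        if "\t" in wav_name or "\t" in text:
            # A stray tab would break the two-column layout.
            raise ValueError(f"{wav_name}: unexpected tab character")
        lines.append(f"{wav_name}\t{text}")

    # newline="\n" keeps line endings identical on every platform.
    with open(path, "w", encoding="utf-8-sig", newline="\n") as f:
        f.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Hypothetical output file name, used only for this example.
    write_transcript("trans.txt", [("ID20190102_083350_43_2_78120.wav", "dzien")])
```

Running the script produces a one-line file in which the file name and the word are separated by a real tab character rather than spaces.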