I am trying to transcribe WAV audio files using the Google Speech-to-Text API. Most of the transcriptions work, except for one WAV file, which always fails with:
Unhandled error { Error: 3 INVALID_ARGUMENT: WAV header indicates an unsupported format.
I have referred to https://cloud.google.com/speech-to-text/docs/encoding, which says:
Note: Speech-to-Text supports WAV files with LINEAR16 or MULAW encoded audio.
I tried both encodings (LINEAR16 and MULAW) in the request config, yet it still failed with the same error.
I then checked the details of the WAV file with the soxi command:
>> soxi org\ hearing.WAV
Input File : 'org hearing.WAV'
Channels : 1
Sample Rate : 22050
Precision : 13-bit
Duration : 00:14:59.99 = 19844721 samples ~ 67499.1 CDDA sectors
File Size : 9.99M
Bit Rate : 88.8k
Sample Encoding: 4-bit IMA ADPCM
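For reference, the same information can be read directly from the file header without sox. The following is only a minimal sketch, assuming a standard RIFF/WAVE layout with the 'fmt ' chunk inside the first few KB of the file; `wav_format_tag` and `describe` are hypothetical helper names of my own, not part of any Google library. In the WAVE format registry, tag 0x0001 is PCM, 0x0007 is mu-law and 0x0011 is IMA ADPCM, which matches what soxi reports for my file.

```python
import struct

# Common wFormatTag values from the RIFF/WAVE format registry.
FORMAT_TAGS = {
    0x0001: "PCM (LINEAR16 when 16-bit)",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law (MULAW)",
    0x0011: "IMA ADPCM",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}

def wav_format_tag(data: bytes) -> int:
    """Return wFormatTag from the 'fmt ' chunk of RIFF/WAVE header bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    # Walk the chunk list until the 'fmt ' chunk is found.
    while pos + 10 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            return struct.unpack_from("<H", data, pos + 8)[0]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no 'fmt ' chunk found in the bytes provided")

def describe(path: str) -> str:
    """Hypothetical helper: read the first 4 KB and name the format tag."""
    with open(path, "rb") as f:
        head = f.read(4096)  # assumption: 'fmt ' appears this early
    tag = wav_format_tag(head)
    return f"0x{tag:04X} {FORMAT_TAGS.get(tag, 'unknown')}"
```

Calling `describe("org hearing.WAV")` on my file should, if the sketch is right, report the IMA ADPCM tag rather than PCM or mu-law.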
May I know whether the "4-bit IMA ADPCM" encoding is supported? If so, which of the supported encodings does it correspond to? https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings
If the source file's codec really is not supported, is there any way to convert it to a supported FLAC/WAV encoding using some GCP function and then extract the text, without the user converting the file manually? I am building this for admin staff, who need a foolproof extraction function.