Please see the implementation notes from the Speech to Text API Explorer for the recognize API you are attempting to use:
Implementation Notes
Sends audio and returns transcription results for
a sessionless recognition request. Returns only the final results; to
enable interim results, use session-based requests or the WebSocket
API. The service imposes a data size limit of 100 MB. It automatically
detects the endianness of the incoming audio and, for audio that
includes multiple channels, downmixes the audio to one-channel mono
during transcoding.
Streaming mode
For requests to transcribe live
audio as it becomes available or to transcribe multiple audio files
with multipart requests, you must set the Transfer-Encoding header to
chunked to use streaming mode. In streaming mode, the server closes
the connection (status code 408) if the service receives no data chunk
for 30 seconds and the service has no audio to transcribe for 30
seconds. The server also closes the connection (status code 400) if no
speech is detected for inactivity_timeout seconds of audio (not
processing time); use the inactivity_timeout parameter to change the
default of 30 seconds.
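To illustrate the streaming-mode requirement above, here is a minimal sketch of how a client frames audio data for HTTP/1.1 chunked transfer encoding and which header it must set. The framing follows the HTTP specification; the audio bytes here are dummy placeholders, not real audio, and you would still need to send the result with your own endpoint URL and authentication.

```python
def chunked_body(chunks):
    """Frame an iterable of byte chunks using HTTP/1.1 chunked encoding:
    each chunk is preceded by its length in hex, and a zero-length
    chunk terminates the stream."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would terminate the stream early
            out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return out

headers = {
    "Content-Type": "audio/flac",        # match your audio format
    "Transfer-Encoding": "chunked",      # required for streaming mode
}

# Placeholder data standing in for pieces of an audio file read as they
# become available.
body = chunked_body([b"audio-part-1", b"audio-part-2"])
```

Most HTTP libraries will do this framing for you automatically when you pass a streaming body without a Content-Length, so in practice you usually only need to make sure the Transfer-Encoding header ends up as chunked.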
There are two factors here. First, there is a data size limit of 100 MB, so make sure you do not send files larger than that to the Speech to Text service. Second, the server will close the connection and return a 400 error if no speech is detected for the number of seconds defined by inactivity_timeout. The default value is 30 seconds, which matches the error you are seeing above.
I would suggest making sure there is valid speech in the first 30 seconds of your file, and/or increasing the inactivity_timeout parameter, to see whether the problem persists. To make things easier, you can test the failing file and other sound files by using the API Explorer in a browser:
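As a sketch of the second suggestion, this is how you might raise inactivity_timeout on a sessionless recognize request. The base URL here is a placeholder (check your service instance's documentation for the real host); inactivity_timeout is the query parameter described in the implementation notes above.

```python
from urllib.parse import urlencode

# Placeholder host -- substitute your actual Speech to Text endpoint.
BASE = "https://example-speech-to-text-host/v1/recognize"

params = {
    "inactivity_timeout": 60,  # raise from the 30-second default
}
url = BASE + "?" + urlencode(params)

# Send this URL with your audio as the request body, the appropriate
# Content-Type header, and your authentication credentials.
```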
Speech to Text API Explorer