I am helping a client convert a video file to audio using ffmpeg. They originally transcoded with an audio bit rate of 64 kbps (-b:a 64k) and a sampling rate of 44100 Hz (-ar 44100). Their goal is to get the most accurate transcriptions possible from the Google Cloud Speech-to-Text API.
While combing through the Speech-to-Text documentation, I did not find anything on how bit rate affects transcription accuracy. So my question is: would a higher bit rate, such as 128k, give better transcriptions, or does it not matter?
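For concreteness, the two variants I am comparing look roughly like the commands built below. This is only a sketch: the helper name build_ffmpeg_args, the file names, and the .mp3 output extension are placeholders of mine, and the -vn flag (drop the video stream) is my assumption about how the audio was extracted. Only -b:a and -ar come from the client's actual command. The script only builds and prints the argument lists; it does not run ffmpeg.

```python
def build_ffmpeg_args(src, dst, bitrate="64k", sample_rate=44100):
    """Build (but do not run) an ffmpeg argument list for video-to-audio.

    src, dst and the helper itself are hypothetical placeholders.
    """
    return [
        "ffmpeg",
        "-i", src,                # input video file
        "-vn",                    # assumption: discard the video stream
        "-ar", str(sample_rate),  # audio sampling rate in Hz
        "-b:a", bitrate,          # audio bit rate (the setting in question)
        dst,                      # output audio file
    ]


if __name__ == "__main__":
    # Print the original 64k command and the proposed 128k alternative.
    for rate in ("64k", "128k"):
        args = build_ffmpeg_args("input.mp4", f"output_{rate}.mp3", bitrate=rate)
        print(" ".join(args))
```

The only difference between the two printed commands is the value passed to -b:a, which is the variable I am asking about.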