I have a sample .webm file recorded with MediaRecorder in the Chrome browser. When I use the Google Cloud Speech Java client to transcribe it, the response contains no transcription at all. Here is my code:
SpeechSettings settings = null;
Path path = Paths.get("D:\\scrap\\gcp_test.webm");
byte[] content = null;
try {
    // Read the whole recording into memory and set up the client credentials
    content = Files.readAllBytes(path);
    settings = SpeechSettings.newBuilder().setCredentialsProvider(credentialsProvider).build();
} catch (IOException e1) {
    throw new IllegalStateException(e1);
}
try (SpeechClient speech = SpeechClient.create(settings)) {
    // Build the recognition config for the local .webm file
    RecognitionConfig config = RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.LINEAR16)
            .setLanguageCode("en-US")
            .setUseEnhanced(true)
            .setModel("video")
            .setEnableAutomaticPunctuation(true)
            .setSampleRateHertz(48000)
            .build();
    // Send the audio bytes inline with the request
    RecognitionAudio audio = RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build();
    // Alternative I also tried: reference the same file in Cloud Storage
    // RecognitionAudio audio = RecognitionAudio.newBuilder().setUri("gs://xxxx/gcp_test.webm").build();
    // Use the blocking call to get the audio transcript
    RecognizeResponse response = speech.recognize(config, audio);
    List<SpeechRecognitionResult> results = response.getResultsList();
    for (SpeechRecognitionResult result : results) {
        // The first alternative is the most likely one
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        System.out.printf("Transcription: %s%n", alternative.getTranscript());
    }
} catch (Exception e) {
    e.printStackTrace();
    System.err.println(e.getMessage());
}
If I upload the same file in the demo section at https://cloud.google.com/speech-to-text/, it works fine and shows a transcription, so I have no idea what is going wrong in my code. I also inspected the request sent by the demo to compare its parameters with mine.
I am sending exactly the same set of parameters, but that didn't work. I also tried uploading the file to Cloud Storage and passing its gs:// URI instead of the inline bytes, but that gave the same result (no transcription).
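One thing that might be worth ruling out is a mismatch between the declared encoding and the actual bytes. LINEAR16 denotes uncompressed 16-bit PCM samples, whereas a .webm file is a container that begins with the EBML magic bytes 1A 45 DF A3. Below is a minimal sketch, using only the JDK, that reports which container a file appears to be from its first bytes. The class name ContainerSniffer and the method sniff are hypothetical names made up for this illustration, and the check only looks at well-known signatures. It says nothing about the codec inside the container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Speech client library.
public class ContainerSniffer {

    // Returns true if data contains the given signature starting at offset.
    private static boolean hasPrefix(byte[] data, int offset, byte[] prefix) {
        if (data.length < offset + prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[offset + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Guesses the container format from the leading bytes of a file.
    public static String sniff(byte[] head) {
        // EBML magic number shared by WebM and Matroska files
        if (hasPrefix(head, 0, new byte[] {0x1A, 0x45, (byte) 0xDF, (byte) 0xA3})) {
            return "webm/matroska";
        }
        // WAV: "RIFF" at offset 0 and "WAVE" at offset 8
        if (hasPrefix(head, 0, new byte[] {'R', 'I', 'F', 'F'})
                && hasPrefix(head, 8, new byte[] {'W', 'A', 'V', 'E'})) {
            return "wav";
        }
        if (hasPrefix(head, 0, new byte[] {'f', 'L', 'a', 'C'})) {
            return "flac";
        }
        if (hasPrefix(head, 0, new byte[] {'O', 'g', 'g', 'S'})) {
            return "ogg";
        }
        // Headerless raw PCM has no signature, so it also ends up here
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ContainerSniffer <file>");
            return;
        }
        byte[] buf = new byte[12];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // readNBytes (Java 9+) keeps reading until the buffer is full or EOF
            n = in.readNBytes(buf, 0, buf.length);
        }
        System.out.println(sniff(Arrays.copyOf(buf, n)));
    }
}
```

Running it as java ContainerSniffer D:\scrap\gcp_test.webm should show whether the file really is a WebM/Matroska container. If it is, the bytes are not raw PCM, so the LINEAR16 setting in the config above is a candidate explanation for the empty result. I have not confirmed that this is the cause.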