I'm using the Google Cloud Speech-to-Text API in Java.

I'm getting 0 results when I call speechClient.recognize.

pom.xml:

<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
    <version>0.80.0-beta</version>
</dependency>

Java code:

import java.io.FileInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.speech.v1.RecognitionAudio;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1.RecognizeResponse;
import com.google.cloud.speech.v1.SpeechClient;
import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
import com.google.cloud.speech.v1.SpeechRecognitionResult;
import com.google.cloud.speech.v1.SpeechSettings;
import com.google.protobuf.ByteString;

public class SpeechToText {

    public static void main(String[] args) {

        // Instantiates a client
        try {

            String jsonFilePath = System.getProperty("user.dir") + "/serviceaccount.json";
            FileInputStream credentialsStream = new FileInputStream(jsonFilePath);
            GoogleCredentials credentials = GoogleCredentials.fromStream(credentialsStream);
            FixedCredentialsProvider credentialsProvider = FixedCredentialsProvider.create(credentials);

            SpeechSettings speechSettings = 
                    SpeechSettings.newBuilder()
                        .setCredentialsProvider(credentialsProvider)
                        .build();       

            SpeechClient speechClient = SpeechClient.create(speechSettings);

            //SpeechClient speechClient = SpeechClient.create();

            // The path to the audio file to transcribe         
            String fileName = System.getProperty("user.dir") + "/call-recording-790.opus";

            // Reads the audio file into memory
            Path path = Paths.get(fileName);
            byte[] data = Files.readAllBytes(path);
            ByteString audioBytes = ByteString.copyFrom(data);

            System.out.println(path.toAbsolutePath());

            // Builds the sync recognize request
            RecognitionConfig config = RecognitionConfig.newBuilder().setEncoding(AudioEncoding.LINEAR16)
                    .setSampleRateHertz(8000).setLanguageCode("en-US").build();

            RecognitionAudio audio = RecognitionAudio.newBuilder().setContent(audioBytes).build();

            System.out.println("recognize builder");

            // Performs speech recognition on the audio file
            RecognizeResponse response = speechClient.recognize(config, audio);
            List<SpeechRecognitionResult> results = response.getResultsList();

            System.out.println(results.size()); // ***** HERE 0

            for (SpeechRecognitionResult result : results) {

                // There can be several alternative transcripts for a given chunk of speech.
                // Just use the first (most likely) one here.
                SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
                System.out.printf("Transcription: %s%n", alternative.getTranscript());
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

In the code above, results.size() returns 0. When I upload the same opus file to the demo at https://cloud.google.com/speech-to-text/, it is transcribed correctly.

So why does the recognize call return zero results?

1 Answer

Speech-to-Text can return an empty response for three reasons:

  1. Audio is not clear.
  2. Audio is not intelligible.
  3. Audio is not using the proper encoding.

From what I can see, reason 3 is the most likely cause of your issue: your request declares AudioEncoding.LINEAR16 at 8000 Hz, while the file you send is an .opus recording. To resolve this, check this page to learn how to verify the encoding of your audio file, which must match the parameters you sent in InitialRecognizeRequest.
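
As a rough way to verify the encoding yourself, you can look at the first bytes of the file. The sketch below is not part of the Speech-to-Text client library; the class name AudioContainerSniffer and its methods are made up for illustration, and it only recognizes a few common containers by their magic bytes. It assumes that an Ogg Opus file carries the "OpusHead" marker near the start of its first page.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: guesses the audio container from the file's leading bytes.
public class AudioContainerSniffer {

    /** Returns a best-effort label for the container found in the given bytes. */
    public static String sniff(byte[] data) {
        if (startsWith(data, 0, "OggS")) {
            // Ogg container; an Opus stream announces itself with "OpusHead"
            // shortly after the first page header, so search the first 64 bytes.
            return containsWithin(data, "OpusHead", 64) ? "OGG_OPUS" : "OGG (not Opus)";
        }
        if (startsWith(data, 0, "RIFF") && startsWith(data, 8, "WAVE")) {
            return "WAV";
        }
        if (startsWith(data, 0, "fLaC")) {
            return "FLAC";
        }
        return "UNKNOWN";
    }

    // True if the ASCII bytes of 'magic' appear in 'data' at exactly 'offset'.
    private static boolean startsWith(byte[] data, int offset, String magic) {
        byte[] m = magic.getBytes(StandardCharsets.US_ASCII);
        if (data.length < offset + m.length) {
            return false;
        }
        for (int i = 0; i < m.length; i++) {
            if (data[offset + i] != m[i]) {
                return false;
            }
        }
        return true;
    }

    // True if 'magic' occurs entirely within the first 'limit' bytes of 'data'.
    private static boolean containsWithin(byte[] data, String magic, int limit) {
        int end = Math.min(data.length, limit);
        for (int i = 0; i + magic.length() <= end; i++) {
            if (startsWith(data, i, magic)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java AudioContainerSniffer <audio-file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(sniff(data));
    }
}
```

If this reports OGG_OPUS for call-recording-790.opus, the RecognitionConfig in the question would likely need setEncoding(AudioEncoding.OGG_OPUS) instead of LINEAR16, with setSampleRateHertz set to the rate the file was actually recorded at. The check says nothing about the sample rate or about whether the audio is clear enough to transcribe.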