'google-cloud-speech' JSON output of speech to text returns unreadable text

Question

I sent a speech-to-text request like this:

    gcloud ml speech recognize-long-running gs://audio_abcd/5.flac --language-code=iw-IL --async --audio-channel-count=2 --separate-channel-recognition

but the output I got is not readable:

    {
      "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeResponse",
      "results": [
        {
          "alternatives": [
            {
              "confidence": 0.9204097,
              "transcript": "??? ??? ?????? ????? ?????? ????? ??? ????? ?????? ?????? ?????? ???? ????? ?????? ?????? ?????? ???? ???? ???? ??? ?? ???? ???? ?? ?? ?? ???? ????? ????? ?? ??? ???? ??? ????? ?????? ????? ???? ?? ???? ???? ??? ???? ????? ??? ???? ????? ????? ????? ????? ??? ???? ?????? ??? ????? ?????? ?? ?? ??? ?? ?? ???? ???? ?? ???? ????? ????? ???? ???? ????? ?? ??????? ????? ??? ????? ??? ???? ?? ???? ???? ????? ?????? ???? ??????? ????? ??? ????? ?? ?? ?? ?? ????? ????? ????? ??? ??????? ??? ????? ??"
            }
          ],
          "channelTag": 2
        },

Any idea how I can either extract the Hebrew characters from the output or specify an output character encoding?

Thanks

Eliel Louzoun · Accepted Answer · 2021-05-26T16:22:00

The issue was not in the speech-to-text output itself but in redirecting it to a file, which does not seem to preserve non-ASCII characters. When I sent the output to the console instead, the Hebrew text displayed correctly. I assume writing the output to a Cloud Storage bucket would also do the trick.
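
For anyone hitting the same thing, here is a minimal Python sketch of the likely mechanism and one way around it. It rests on an assumption the answer does not confirm: that the redirection re-encoded the text in a character set without Hebrew letters (which shell or OS was used is not stated). The `response` dict, the sample Hebrew string, and the helper names `save_response` and `load_transcripts` are all made up for illustration; only the JSON field names come from the output above.

```python
import json
import os
import tempfile

# Hypothetical response shaped like the one in the question.
# The Hebrew transcript is invented sample text, not real API output.
response = {
    "results": [
        {
            "alternatives": [{"confidence": 0.92, "transcript": "שלום עולם"}],
            "channelTag": 2,
        }
    ]
}


def save_response(resp, path):
    """Write the response as JSON, forcing UTF-8 instead of the platform default."""
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps the Hebrew letters readable in the file
        # instead of turning them into \uXXXX escapes.
        json.dump(resp, fh, ensure_ascii=False, indent=2)


def load_transcripts(path):
    """Read the JSON back as UTF-8 and collect every transcript string."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return [
        alt["transcript"]
        for result in data["results"]
        for alt in result["alternatives"]
    ]


# What a lossy encoding does: each character it cannot represent becomes "?".
lossy = "שלום עולם".encode("ascii", errors="replace").decode("ascii")

path = os.path.join(tempfile.mkdtemp(), "transcript.json")
save_response(response, path)
transcripts = load_transcripts(path)

print(lossy)                           # the question-mark pattern from the question
print(transcripts[0] == "שלום עולם")   # True when the round trip kept the text intact
```

The point of the sketch is that the question marks appear at the encoding step, not in the recognition result, so writing and reading the file explicitly as UTF-8 avoids depending on whatever encoding the shell redirection picks.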
