1
votes

I need real-time speech recognition through the Google Cloud Speech API. However, it is still in beta, and there is not much helpful material available on the internet.

There are a few samples available at https://cloud.google.com/speech/docs/samples, but I don't see a streaming API sample for C#. Does that mean I cannot use C# to stream my audio input to the Google Cloud Speech API?

Has anyone tried streaming audio input to the Cloud Speech API using .NET?

FYI, I cannot use the normal Web Speech API available from Google. I need to use only the Google Cloud Speech API.

1
There seem to be people who got it working in C#. However, there isn't any sample code available. groups.google.com/forum/#!topic/cloud-speech-discuss/… If you find anything, please let me know. I've been looking for the answer myself. - Hespen
Hespen, as a temporary solution I introduced a Node.js binary server and connected it to my JS through a WebSocket for audio streaming. Node.js then communicates with the Google Cloud Speech API. This solution seems to be working well for now, but I am really looking for a C# solution to keep things simple and clean. - Paresh Varde
Thanks for the info! - Hespen
@PareshVarde could you please post a few code samples from the Node.js implementation you have done? It would be a real help for me. - Kiran B
Can you please share your code solution to this? - Ivan Fontalvo

1 Answer

4
votes

You have to download the sample applications from here: https://cloud.google.com/speech/docs/samples

There you will find the Speech samples: QuickStart and Recognize.

The Recognize sample has a lot of options, and one of them is Listen. This option streams audio and writes the results to the console continuously.

The sample uses a protobuf byte stream for streaming. Here is the main part of the code:

var credential = GoogleCredential.FromFile( "privatekey.json" ).CreateScoped( SpeechClient.DefaultScopes );
var channel = new Grpc.Core.Channel( SpeechClient.DefaultEndpoint.ToString(), credential.ToChannelCredentials() );
var speech = SpeechClient.Create( channel );
var streamingCall = speech.StreamingRecognize();
// Write the initial request with the config.
await streamingCall.WriteAsync(
    new StreamingRecognizeRequest()
    {
        StreamingConfig = new StreamingRecognitionConfig()
        {
            Config = new RecognitionConfig()
            {
                Encoding =
                RecognitionConfig.Types.AudioEncoding.Linear16,
                SampleRateHertz = 16000,
                LanguageCode = "hu",
            },
            InterimResults = true,
        }
    } );

Of course, the language code must be changed to match your audio.

Then the audio content must be streamed:

streamingCall.WriteAsync(
    new StreamingRecognizeRequest()
    {
        AudioContent = Google.Protobuf.ByteString
            .CopyFrom( args.Buffer, 0, args.BytesRecorded )
    } ).Wait();
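
The two snippets above leave out where `args` comes from and how the results are read back. The following is a minimal sketch of how those pieces could fit together, not the definitive sample code. It assumes that the microphone is captured with the NAudio library (the `args.Buffer` / `args.BytesRecorded` names suggest a NAudio `WaveInEventArgs`), that `streamingCall` has already received the initial config request shown earlier, and that the client library version exposes `ResponseStream` on the streaming call (newer releases of Google.Cloud.Speech.V1 may expose the responses differently). The class name `MicrophoneStreaming`, the method name `StreamMicrophoneAsync`, and the `seconds` parameter are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Google.Cloud.Speech.V1;
using NAudio.Wave;

static class MicrophoneStreaming
{
    // Hypothetical helper: expects a streamingCall to which the initial
    // config request (see above) has already been written.
    public static async Task StreamMicrophoneAsync(
        SpeechClient.StreamingRecognizeStream streamingCall, int seconds)
    {
        // Print transcripts in the background as responses arrive.
        // Assumption: this library version exposes ResponseStream.
        Task printResponses = Task.Run(async () =>
        {
            while (await streamingCall.ResponseStream.MoveNext(
                default(CancellationToken)))
            {
                foreach (var result in streamingCall.ResponseStream.Current.Results)
                {
                    foreach (var alternative in result.Alternatives)
                    {
                        Console.WriteLine(alternative.Transcript);
                    }
                }
            }
        });

        object writeLock = new object();
        bool writeMore = true;

        // Assumption: NAudio captures the default microphone (device 0).
        using (var waveIn = new WaveInEvent())
        {
            waveIn.DeviceNumber = 0;
            // 16 kHz mono, matching SampleRateHertz = 16000 and Linear16.
            waveIn.WaveFormat = new WaveFormat(16000, 1);
            waveIn.DataAvailable += (sender, args) =>
            {
                lock (writeLock)
                {
                    // Do not write after the request stream is completed.
                    if (!writeMore) return;
                    streamingCall.WriteAsync(
                        new StreamingRecognizeRequest()
                        {
                            AudioContent = Google.Protobuf.ByteString
                                .CopyFrom(args.Buffer, 0, args.BytesRecorded)
                        }).Wait();
                }
            };

            waveIn.StartRecording();
            await Task.Delay(TimeSpan.FromSeconds(seconds));
            waveIn.StopRecording();
        }

        // Stop further writes, then signal the end of the audio.
        lock (writeLock) writeMore = false;
        await streamingCall.WriteCompleteAsync();
        await printResponses;
    }
}
```

The lock around the write is there because the audio callback runs on a different thread than the code that completes the stream, and writing after `WriteCompleteAsync()` would fail.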