2
votes

I am using React for my front end and Java Spring Boot for my server. I record audio with React-Mic, put the recording into a FormData object, and send it to my Java server as the body of an HTTP POST request. However, the recorded audio is in webm, and the Google Speech-to-Text API has no matching encoding for it. Any idea how I can convert the audio to flac or another format supported by the Google Speech-to-Text API?

1 Answer

1
votes

You could probably use JAVE2 to convert from webm to mp3 (or another format).

https://github.com/a-schild/jave2

The sample in the readme should point you in the right direction:

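// Imports assume JAVE2 2.x package names; newer releases may place these classes elsewhere.
import java.io.File;
import ws.schild.jave.AudioAttributes;
import ws.schild.jave.Encoder;
import ws.schild.jave.EncodingAttributes;
import ws.schild.jave.MultimediaObject;

try {
 File source = new File("file path"); // Path to your webm recording
 File target = new File("file path"); // Output path (must differ from the source)

 // Audio attributes
 AudioAttributes audio = new AudioAttributes();
 audio.setCodec("libmp3lame"); // Change this to flac if you prefer flac
 audio.setBitRate(128000);     // Bits per second
 audio.setChannels(2);         // Stereo
 audio.setSamplingRate(44100); // Hz

 // Encoding attributes
 EncodingAttributes attrs = new EncodingAttributes();
 attrs.setFormat("mp3"); // Change to flac if you prefer flac
 attrs.setAudioAttributes(audio);

 // Encode
 Encoder encoder = new Encoder();
 encoder.encode(new MultimediaObject(source), target, attrs);
 // The target file should now be present at the path specified above
} catch (Exception ex) {
 // Covers both invalid input and encoder failures
 ex.printStackTrace();
}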

After conversion you will have a File object that you can read into a byte[] and send to the Speech-to-Text API, as in this sample:

https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/speech/cloud-client/src/main/java/com/example/speech/QuickstartSample.java
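
The file-to-byte[] step can be sketched with the standard library alone. This is only a rough illustration: `ConvertedAudioLoader`, `readAudioBytes`, and the `maxBytes` guard are hypothetical names that do not come from JAVE2 or the Google client library, and the limit you pass should be whatever suits your own request size.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical helper, not part of JAVE2 or the Google client library. */
class ConvertedAudioLoader {

    /**
     * Reads the converted audio file fully into memory.
     * maxBytes is an illustrative guard (an assumption, not an API rule)
     * so that an unexpectedly large file is rejected before it is loaded.
     */
    static byte[] readAudioBytes(File target, long maxBytes) throws IOException {
        // The encoder may have failed silently, so check the output exists.
        if (!target.isFile()) {
            throw new IOException("Converted file not found: " + target.getPath());
        }
        long size = target.length();
        if (size == 0) {
            throw new IOException("Converted file is empty: " + target.getPath());
        }
        if (size > maxBytes) {
            throw new IOException(
                "Converted file is " + size + " bytes, above the limit of " + maxBytes);
        }
        // Files.readAllBytes is fine for short recordings held in memory.
        return Files.readAllBytes(target.toPath());
    }
}
```

You would call it with the `target` file from the conversion step, for example `readAudioBytes(target, 10_000_000L)`. The linked QuickstartSample wraps such an array with `ByteString.copyFrom(data)`; the encoding and sample rate set in the recognition config would presumably need to match whatever format you converted to.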