
I'm trying to get an audio stream from a text-to-speech interface (MaryTTS) and stream it in an SIP RTP session (using Peers).

Peers wants a SoundSource to stream audio, which is an interface defined as

public interface SoundSource {

    byte[] readData();

}

and MaryTTS synthesises a String to an AudioInputStream. I tried simply reading the stream and buffering it out to Peers through a SoundSource implementation, along the lines of

MaryInterface tts = new LocalMaryInterface();
AudioInputStream audio = tts.generateAudio("This is a test.");
SoundSource soundSource = new SoundSource() {

    @Override
    public byte[] readData() {
        try {
            byte[] buffer = new byte[1024];
            audio.read(buffer);
            return buffer;
        } catch (IOException e) {
            return null;
        }
    }
};
// issue call with soundSource using Peers

The phone rings, and I hear a slow, low, noisy sound instead of the synthesised speech. I guess it has something to do with the audio format the SIP RTP session expects, since the Peers documentation states:

The sound source must be raw audio with the following format: linear PCM 8kHz, 16 bits signed, mono-channel, little endian.

How can I convert/read the AudioInputStream to satisfy these requirements?


1 Answer


One way I know is the following, though given the libraries you are using I don't know whether it will work:

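import java.io.*;
import javax.sound.sampled.*;

// Read the whole source stream into memory.
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
try {
    byte[] data = new byte[1024];
    int k;
    while ((k = audioInputStream.read(data, 0, data.length)) >= 0) {
        outputStream.write(data, 0, k);
    }
    // Target format: 8 kHz, 16 bits, mono, signed, little endian.
    AudioFormat af = new AudioFormat(8000f, 16, 1, true, false);
    byte[] audioData = outputStream.toByteArray();
    InputStream byteArrayInputStream = new ByteArrayInputStream(audioData);
    // Note: this only re-labels the raw bytes with the new format. It does not
    // resample them, so it is only correct if the source already has this format.
    AudioInputStream audioInputStream2 = new AudioInputStream(
            byteArrayInputStream, af, audioData.length / af.getFrameSize());
    outputStream.close();
} catch (Exception ex) {
    ex.printStackTrace();
}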

There is also this method:

AudioSystem.getAudioInputStream(AudioFormat targetFormat, AudioInputStream sourceStream)

which you can call with the AudioFormat parameters above as the target format and the MaryTTS stream as the source.
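As a rough sketch of how the second suggestion might fit together with the SoundSource from the question: the helper below asks Java Sound for a converted view of the stream and then hands out chunks that contain only the bytes actually read. The class name PeersAudio and its methods are hypothetical and not part of Peers or MaryTTS. Whether the installed Java Sound providers can resample the MaryTTS output to 8 kHz is an assumption you would need to verify (AudioSystem.isConversionSupported can tell you), and returning null at the end of the stream is a guess at what Peers expects.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper; not part of Peers or MaryTTS.
class PeersAudio {

    // Format quoted from the Peers documentation:
    // linear PCM, 8 kHz, 16 bits signed, mono, little endian.
    static final AudioFormat PEERS_FORMAT =
            new AudioFormat(8000f, 16, 1, true, false);

    // Asks Java Sound for a converted view of the source stream.
    // Throws IllegalArgumentException if no installed converter
    // supports this conversion.
    static AudioInputStream toPeersFormat(AudioInputStream source) {
        return AudioSystem.getAudioInputStream(PEERS_FORMAT, source);
    }

    // Reads up to maxBytes and returns only the bytes actually read,
    // so a short final read is not padded with zeros.
    // Returns null once the stream is exhausted or a read fails
    // (assumption: the caller treats null as "no more data").
    static byte[] readChunk(InputStream in, int maxBytes) {
        try {
            byte[] buffer = new byte[maxBytes];
            int n = in.read(buffer, 0, buffer.length);
            if (n <= 0) {
                return null;
            }
            return n == buffer.length ? buffer : Arrays.copyOf(buffer, n);
        } catch (IOException e) {
            return null;
        }
    }
}
```

With this in place, readData() in the question's SoundSource could simply return PeersAudio.readChunk(converted, 1024), where converted is the result of PeersAudio.toPeersFormat(audio) created once before the call is issued.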