Resampling audio in Android from 48kHz to 44.1kHz and vice versa - pure Java, or OpenSL ES?

Question

I've managed to join audio tracks of video files using MediaCodec. There are no problems doing so if the channel count and the sample rate of both audio tracks are the same.

(For some reason, the OMX.SEC.aac.dec decoder always outputs 44100 Hz 2-channel audio if the original track is 22050 Hz, and 48000 Hz 2-channel audio if the original track is 24000 Hz.)

The problem comes in when I try appending a 24000 Hz audio track after a 22050 Hz audio track. Assuming I want to output a 22050 Hz audio track consisting of both said tracks, I'll have to resample the 24000 Hz one.

I tried this:

private byte[] minorDownsamplingFrom48kTo44k(byte[] origByteArray)
{
    int origLength = origByteArray.length;
    int moddedLength = origLength * 147/160;
    int delta = origLength - moddedLength;
    byte[] resultByteArray = new byte[moddedLength];
    int arrayIndex = 0;
    for(int i = 0; i < origLength; i+=11)
    {
        for(int j = i; j < i+10; j++)
        {
            resultByteArray[arrayIndex] = origByteArray[j];
            arrayIndex++;
        }
    }
    return resultByteArray;
}

It returns a byte array of 3700-something bytes and the correct audio after the encoding... behind a very loud scrambled sound.

My questions:

How do I correctly downsample the audio track without leaving such artifacts? Should I use average values?
Should I use a resampler implemented using OpenSL ES to make the process faster and/or better?

Your data is probably 16bit and especially if it's two channels you can't just skip bytes from the data. Also if you want to do it properly you should filter the end result to make sure there's no aliasing artefacts or anything. But I think the first part is the main problem. Copying 40 bytes and skipping 4 might fix that. — Sami Kuhmonen
16bit and upsampled (the source had a 22050 Hz sample rate, and the decoder output was 44100 Hz), apparently. Anyway, how do I apply this filter on a sample-to-sample basis in Java? — Gensoukyou1337
Copying 40 bytes and skipping 4 works, but the audio in the final output video makes the whole video play at a slower speed. — Gensoukyou1337
Could you please advise how to implement a resampler using OpenSL? — Mike

Sami Kuhmonen · Accepted Answer · 2016-10-18T10:24:38

The main issue is that you're just skipping bytes when you should be skipping samples.

Each sample is 16 bits, so two bytes. If the audio is stereo, each sample covers both channels and takes four bytes. You always have to skip a multiple of that many bytes; otherwise your samples will get completely mixed up.
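To make the byte arithmetic concrete, here is a minimal sketch that works out how many bytes belong together (one "frame", meaning one sample for every channel). The class and method names are made up for illustration, and it assumes the decoder output is plain interleaved PCM with a whole number of bytes per sample.

```java
// Hypothetical helper, not part of any Android API.
final class PcmFrames {

    // Bytes that must be kept or dropped as one unit:
    // one sample for each channel.
    static int bytesPerFrame(int channelCount, int bitsPerSample) {
        return channelCount * (bitsPerSample / 8);
    }

    // Number of complete frames in a buffer of the given length;
    // any trailing partial frame is ignored.
    static int frameCount(int byteLength, int channelCount, int bitsPerSample) {
        return byteLength / bytesPerFrame(channelCount, bitsPerSample);
    }
}
```

For 16-bit stereo this gives 4 bytes per frame, so a 4096-byte buffer holds 1024 frames.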

Using the same ratio (10/11), you can copy 40 bytes and skip 4 out of every 44, so that you always skip a full four-byte sample and keep the samples intact.
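A rough sketch of that copy-40/skip-4 idea, written in terms of whole frames rather than raw bytes, might look like the following. The class name, method name, and parameters are hypothetical. Note that 10/11 (about 0.909) only approximates the exact 44100/48000 = 147/160 (about 0.919) ratio, and that simply dropping samples applies no low-pass filter, so treat this as an illustration of the byte alignment and not as a proper resampler.

```java
// Hypothetical helper: keeps `keep` frames, then drops `drop` frames, repeatedly.
final class FrameDropper {

    static byte[] dropFrames(byte[] pcm, int bytesPerFrame, int keep, int drop) {
        int totalFrames = pcm.length / bytesPerFrame; // ignore a trailing partial frame
        int period = keep + drop;

        // Work out the exact output size up front.
        int fullPeriods = totalFrames / period;
        int remainder = totalFrames % period;
        int outFrames = fullPeriods * keep + Math.min(remainder, keep);

        byte[] out = new byte[outFrames * bytesPerFrame];
        int outPos = 0;
        for (int frame = 0; frame < totalFrames; frame++) {
            // Only the first `keep` frames of each period are copied.
            if (frame % period < keep) {
                System.arraycopy(pcm, frame * bytesPerFrame, out, outPos, bytesPerFrame);
                outPos += bytesPerFrame;
            }
        }
        return out;
    }
}
```

For 16-bit stereo, `FrameDropper.dropFrames(pcm, 4, 10, 1)` copies 40 bytes and skips 4 out of every 44, so the left and right channels never get swapped or split.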

As to why the resulting video is playing at a different speed, that's something completely different.
