
So far in my quest to concatenate videos with MediaCodec, I've finally managed to resample 48 kHz audio to 44.1 kHz.

I've been testing joining two videos: the first has an audio track in 22050 Hz, 2-channel format; the second has an audio track in 24000 Hz, 1-channel format. Since my decoder outputs 44100 Hz, 2-channel raw audio for the first video and 48000 Hz, 2-channel raw audio for the second, I resampled the ByteBuffers that the second video's decoder outputs from 48000 Hz down to 44100 Hz using this method:

private byte[] minorDownsamplingFrom48kTo44k(byte[] origByteArray)
{
    int origLength = origByteArray.length;
    // 44100 / 48000 = 147 / 160
    int moddedLength = origLength * 147 / 160;
    byte[] resultByteArray = new byte[moddedLength];
    int arrayIndex = 0;
    // Copy 40 bytes out of every 44, i.e. drop one 4-byte stereo frame
    // out of every 11.
    for (int i = 0; i < origLength; i += 44)
    {
        for (int j = i; j < Math.min(i + 40, origLength); j++)
        {
            resultByteArray[arrayIndex] = origByteArray[j];
            arrayIndex++;
        }
    }
    return resultByteArray;
}

However, in the output video file, playback slows down once it reaches the second video with the downsampled audio track. The pitch is the same and the noise is gone, but the audio samples simply play back more slowly.

My output format is actually 22050 Hz 2 channels, following the first video.

EDIT: It's as if the player still treats the audio as having a sample rate of 48000 Hz even after it's been downsampled to 44100 Hz.
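For comparison, resampling is usually done per frame rather than per byte. The sketch below is not the code from the question; it's a minimal, hedged illustration (class and method names are made up) that assumes 16-bit little-endian stereo PCM, maps each output frame to a fractional input position, and linearly interpolates between neighboring samples:

```java
// Illustrative sketch: frame-aligned 48 kHz -> 44.1 kHz downsampling with
// linear interpolation. Assumes 16-bit little-endian stereo PCM (4 bytes
// per frame). Names are hypothetical, not from the question's code.
public class PcmResampler {
    public static byte[] downsample48kTo44k1(byte[] input) {
        int frameBytes = 4;                        // 2 channels * 2 bytes
        int inFrames = input.length / frameBytes;
        // 44100 / 48000 = 147 / 160
        int outFrames = (int) ((long) inFrames * 147 / 160);
        byte[] out = new byte[outFrames * frameBytes];
        for (int f = 0; f < outFrames; f++) {
            double srcPos = f * 160.0 / 147.0;     // position in input frames
            int i0 = (int) srcPos;
            int i1 = Math.min(i0 + 1, inFrames - 1);
            double t = srcPos - i0;                // interpolation weight
            for (int ch = 0; ch < 2; ch++) {
                short a = readSample(input, i0 * frameBytes + ch * 2);
                short b = readSample(input, i1 * frameBytes + ch * 2);
                short v = (short) Math.round(a + (b - a) * t);
                writeSample(out, f * frameBytes + ch * 2, v);
            }
        }
        return out;
    }

    static short readSample(byte[] buf, int off) {
        return (short) ((buf[off] & 0xFF) | (buf[off + 1] << 8));
    }

    static void writeSample(byte[] buf, int off, short v) {
        buf[off] = (byte) v;
        buf[off + 1] = (byte) (v >> 8);
    }
}
```

For an 8192-byte decoder buffer (2048 stereo frames), this produces 1881 frames, i.e. 7524 bytes.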

My questions:

  1. How do I mitigate this problem? I don't think changing the timestamps works in this case; I just use the decoder-provided timestamps, offset by the first video's last timestamp.
  2. Is the issue related to the CSD-0 ByteBuffers?
  3. If MediaCodec has the option of changing the video bitrate on the fly, would a new feature of changing the audio sample rate or channel count on the fly be feasible?

1 Answer


Turns out it was something as simple as limiting the size of my ByteBuffers.

The decoder outputs 8192 bytes (2048 samples).

After downsampling, the data becomes 7524 bytes (1881 samples) - originally 7526 bytes but that amounts to 1881.5 samples, so I rounded it down.
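That rounding step amounts to snapping the scaled length down to a whole frame. A minimal sketch of the arithmetic (illustrative names; assumes 16-bit stereo, i.e. 4-byte frames):

```java
// Illustrative sketch of the length arithmetic described in the answer:
// scale the byte count by 147/160, then round down to a whole 4-byte
// stereo frame. The class/method names are hypothetical.
public class ResampledLength {
    public static int frameAlignedLength(int inputBytes) {
        int scaled = inputBytes * 147 / 160;  // 8192 -> 7526 (1881.5 samples)
        return scaled - (scaled % 4);         // 7526 -> 7524 (1881 samples)
    }
}
```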

The main mistake was in this code, where I halve the data to bring the sample rate close to the original:

byte[] finalByteBufferContent = new byte[size / 2]; // the mistake is here

for (int i = 0; i < bufferSize; i += 2) {
    if ((i + 1) * ((int) samplingFactor) > testBufferContents.length) {
        finalByteBufferContent[i] = 0;
        finalByteBufferContent[i + 1] = 0;
    } else {
        finalByteBufferContent[i] = testBufferContents[i * ((int) samplingFactor)];
        finalByteBufferContent[i + 1] = testBufferContents[i * ((int) samplingFactor) + 1];
    }
}

bufferSize = finalByteBufferContent.length;

Where size is the decoder output ByteBuffer's length and testBufferContents is the byte array I use to modify its contents (and is the one that was downsampled to 7524 bytes).

The resulting byte array's length was still 4096 bytes instead of 3762 bytes.

Changing new byte[size / 2] to new byte[testBufferContents.length / 2] resolved that problem.
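With that fix applied, the halving step can be sketched as follows. This is a hedged reconstruction mirroring the answer's loop (variable names follow the answer; it assumes samplingFactor is 2 and that testBufferContents holds the downsampled data):

```java
// Illustrative sketch of the corrected halving step: the output buffer is
// sized from the downsampled array (per the fix in the answer), not from
// the decoder ByteBuffer's original size. Assumes samplingFactor == 2.
public class HalvingStep {
    public static byte[] halveSampleRate(byte[] testBufferContents) {
        int samplingFactor = 2;
        // Fix: derive the length from the downsampled data, e.g. 7524 -> 3762.
        byte[] out = new byte[testBufferContents.length / 2];
        for (int i = 0; i + 1 < out.length; i += 2) {
            int src = i * samplingFactor;  // keep every other 16-bit sample
            out[i] = testBufferContents[src];
            out[i + 1] = testBufferContents[src + 1];
        }
        return out;
    }
}
```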