
As I understand it, the audio byte array that I am using (16-bit stereo PCM) has 4 bytes per sample. I noticed that when I invert every byte value (i.e. -128 to 128 and 128 to -128), the sound is not placed in the surround channel. It sounds the same (front audio). I experimented with inverting every other byte (every 2 bytes) rather than all of the bytes and got something like surround sound, but it is very dirty and choppy. How exactly do I manipulate a regular 16-bit stereo PCM WAV file (in byte array form) so that the audio is placed in the surround channels?

My Code:

public byte[] putInSurround(byte[] audio) {
        for (int i = 0; i < audio.length; i += 4) {
            int i0 = audio[i + 0];
            int i1 = audio[i + 1];
            int i2 = audio[i + 2];
            int i3 = audio[i + 3];
            if (0 > audio[i + 0]) {
                i0 = Math.abs(audio[i + 0]);
            }
            if (0 < audio[i + 0]) {
                i0 = 0 - audio[i + 0];
            }
            if (0 > audio[i + 1]) {
                i1 = Math.abs(audio[i + 1]);
            }
            if (0 < audio[i + 1]) {
                i1 = 0 - audio[i + 1];
            }
            if (0 > audio[i + 2]) {
                i2 = Math.abs(audio[i + 2]);
            }
            if (0 < audio[i + 2]) {
                i2 = 0 - audio[i + 2];
            }
            if (0 > audio[i + 3]) {
                i3 = Math.abs(audio[i + 3]);
            }
            if (0 < audio[i + 3]) {
                i3 = 0 - audio[i + 3];
            }
            audio[i + 0] = (byte) i0;
            //audio[i + 1] = (byte) i1; <-- Commented Out For Every Other Byte.
            //audio[i + 2] = (byte) i2; <-- Commented Out For Every Other Byte.
            audio[i + 3] = (byte) i3;
        }
        return audio;
    }

1 Answer


I am not in any way, shape or form an expert in DSP, but I have a few observations that might be helpful:

  • You parse your array in increments of 4 bytes, which correctly corresponds to a single 16-bit stereo sound sample: 2 channels * 16 bits = 32 bits = 4 bytes.

    Now, I may not understand what you are trying to do, but in modern surround audio, the surround channels are usually independent of each other. That means that you will need more than 4 bytes per surround audio sample. If, for example, you have 5 channels at 16 bits each, you will need 10 bytes/sample, which probably means that you need separate input and output arrays in your code.

    There are methods such as Dolby Surround and Dolby Pro Logic, where the surround channels are matrix-encoded into the two stereo channels, but the DSP mathematics involved is far more complex than what you have in your code. Not to mention the need for a special decoder and the quality loss implied by such methods.

  • Inverting each byte of a 2-byte sample makes no sense: a sample value of 1000 (0x03E8) would become -744 (0xFD18). Byte-wise operations like this are rarely used in DSP, if at all.

  • 16-bit audio samples are usually stored as signed two's complement binary numbers. That makes handling them byte-wise quite complex, especially in a language such as Java, which has no unsigned byte type and no pointer casting. You would be better off converting the byte array into an array of short or int, or using a different programming language such as C++.

  • Inverting -128 produces +128, which cannot be stored in Java's signed byte type (range -128 to 127). The cast back to byte wraps it around to -128 again.

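  • When "inverting every other byte", you store the inverse of bytes i + 0 and i + 3, instead of i + 0 and i + 2 or i + 1 and i + 3. In a little-endian layout, that is the low byte of one channel and the high byte of the other.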

  • The result of inverting every other byte, while still not making any sense, has a different effect depending on whether your audio representation is little-endian or big-endian. RIFF WAV files use little-endian byte order.

    Inverting bytes 0 and 2 changes the least significant byte of each sample, which would merely add noise at high amplitudes and cause outright distortion when the dynamic range of the audio clip is limited.

    Inverting bytes 1 and 3 would approximate inverting the whole sample at high amplitudes and would add a lot of distortion in clips with limited dynamic range.

  • Inverting the whole sample, rather than individual bytes, amounts to a polarity inversion, i.e. a 180-degree phase shift. I am not sure where you can use that, though...
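
To make the observations above more concrete, here is a minimal Java sketch, assuming 16-bit little-endian samples as found in a RIFF WAV data chunk. The class and method names (PcmSamples, readSample, writeSample, negate, invertPolarity, negateEachByte, frameSize) are hypothetical and not taken from your code. It only shows how to decode a sample, negate it as a whole, and compare that with byte-wise negation; it does not do any surround encoding.

```java
/** Hypothetical helpers for 16-bit little-endian PCM (an assumption, as in RIFF WAV). */
final class PcmSamples {

    /** Bytes per sample frame: one sample per channel, e.g. 2 channels * 2 bytes = 4. */
    static int frameSize(int channels, int bytesPerSample) {
        return channels * bytesPerSample;
    }

    /** Reads the signed 16-bit little-endian sample that starts at offset. */
    static short readSample(byte[] audio, int offset) {
        int low = audio[offset] & 0xFF; // mask to undo Java's sign extension
        int high = audio[offset + 1];   // the high byte carries the sign
        return (short) ((high << 8) | low);
    }

    /** Writes a signed 16-bit sample at offset, low byte first. */
    static void writeSample(byte[] audio, int offset, short sample) {
        audio[offset] = (byte) sample;
        audio[offset + 1] = (byte) (sample >> 8);
    }

    /** Negates a whole sample; -32768 has no positive counterpart, so it is clamped. */
    static short negate(short sample) {
        return sample == Short.MIN_VALUE ? Short.MAX_VALUE : (short) -sample;
    }

    /** Inverts the polarity of every 16-bit sample in place (both channels). */
    static byte[] invertPolarity(byte[] audio) {
        for (int i = 0; i + 1 < audio.length; i += 2) {
            writeSample(audio, i, negate(readSample(audio, i)));
        }
        return audio;
    }

    /** Negates each byte on its own, for comparison; this is NOT a sample inversion. */
    static byte[] negateEachByte(byte[] audio) {
        for (int i = 0; i < audio.length; i++) {
            audio[i] = (byte) -audio[i]; // -128 wraps back to -128
        }
        return audio;
    }
}
```

With these helpers, a sample of 1000 passed through invertPolarity comes back as -1000, whereas negateEachByte turns it into -744, and frameSize(5, 2) gives the 10 bytes per frame mentioned for five 16-bit channels.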

If you need more help than this, you need to tell us exactly what you are trying to do. You should at least mention what your expected output is and which DSP algorithms you are using.