In a comment, you stated "I am playing audio file... I read it as byte[] and then I need to normalize audio by putting values into range of [-1,1] and then I need to put that byte[] back into playing audio player"
I am making a big assumption here, but I'm guessing that the data you receive from ar.ReadData() is a byte array of 2-channel, 16-bit, 44.1 kHz PCM data. (Side note: are you using the Alvas.Audio library?) If that is the case, here is how to do what you want.
Background
First, a little background. A 2-channel, 16-bit PCM data stream looks like this:
byte    | 01 02 | 03 04 | 05 06 | 07 08 | 09 10 | 11 12 | ...
channel | Left  | Right | Left  | Right | Left  | Right | ...
frame   |     First     |    Second     |     Third     | ...
sample  | 1st L | 1st R | 2nd L | 2nd R | 3rd L | 3rd R | ... etc.
It's important here to take note of a few things:
- Since the audio data is 16-bit, a single sample from a single channel is a short (2 bytes), not an int (4 bytes), with a value in the range -32768 to 32767.
- This data is in little-endian representation, and unless your architecture is also little-endian, you can't use the .NET BitConverter class for the conversion.
- We don't have to split the data into per-channel streams, because we are normalizing both channels based on the single highest value from either channel.
- Converting a floating-point value to an integer value will result in quantization errors, so you probably want to use some sort of dithering (which is an entire topic in its own right).
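To make the interleaved layout above concrete, here is a small sketch of how you might locate a given sample in the byte array. SampleOffset is a hypothetical helper name I made up for illustration, and it assumes 16-bit samples with the channels interleaved as in the diagram (channel 0 = left, channel 1 = right).

```csharp
using System;

public static class PcmLayout
{
    // Hypothetical helper: byte offset of one 16-bit sample in interleaved PCM data.
    // frame and channel are zero-based; channelCount defaults to stereo.
    public static int SampleOffset(int frame, int channel, int channelCount = 2)
    {
        const int bytesPerSample = 2; // 16-bit audio: one sample is a short
        return (frame * channelCount + channel) * bytesPerSample;
    }
}
```

For example, the right-channel sample of the third frame (frame 2, channel 1) starts at offset 10, which is the byte labelled 11 in the diagram because the diagram counts from 1.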
Helper Functions
Before we jump into the actual normalization, let's make this easier on ourselves by writing a couple of helper functions to get a short from a byte[] and vice-versa:
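// Combine two little-endian bytes (low byte first) into one signed 16-bit sample.
short GetShortFromLittleEndianBytes(byte[] data, int startIndex)
{
    return (short)((data[startIndex + 1] << 8)
        | data[startIndex]);
}

// Split a signed 16-bit sample into two little-endian bytes (low byte first).
// Masking with 0xFF keeps the casts safe even if the code is compiled in a checked context.
byte[] GetLittleEndianBytesFromShort(short data)
{
    byte[] b = new byte[2];
    b[0] = (byte)(data & 0xFF);
    b[1] = (byte)((data >> 8) & 0xFF);
    return b;
}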
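As an aside, if you are on a newer runtime (.NET Core 2.1 or later, as far as I know), the System.Buffers.Binary.BinaryPrimitives class has endian-explicit reads and writes, so you may not need hand-rolled helpers at all. The sketch below wraps it; PcmBytes, ReadSample and WriteSample are names I made up for illustration, not part of any library.

```csharp
using System;
using System.Buffers.Binary;

public static class PcmBytes
{
    // Read one little-endian 16-bit sample starting at startIndex,
    // regardless of the endianness of the machine running the code.
    public static short ReadSample(byte[] data, int startIndex)
    {
        return BinaryPrimitives.ReadInt16LittleEndian(data.AsSpan(startIndex, 2));
    }

    // Write one 16-bit sample as little-endian bytes directly into the array,
    // avoiding a temporary two-byte array per sample.
    public static void WriteSample(byte[] data, int startIndex, short value)
    {
        BinaryPrimitives.WriteInt16LittleEndian(data.AsSpan(startIndex, 2), value);
    }
}
```

Writing straight into the destination array also sidesteps the small allocation that a byte[]-returning helper makes for every sample.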
Normalization
An important distinction should be made here: audio normalization is not the same as statistical normalization. Here we are going to perform peak normalization on our audio data, amplifying the signal by a constant factor so that its peak sits at the upper limit. To peak normalize audio data, we first find the largest absolute sample value, divide the upper limit (for 16-bit PCM data, this is 32767) by it to get a gain factor, and then multiply each sample by this gain. (Adding a constant offset instead would only shift the waveform away from zero; it would not make it louder.)
So, to normalize our audio data, first scan through it to find the peak magnitude:
byte[] input = ar.ReadData(); // the function you used above
float biggest = 0F;
float sample;
for (int i = 0; i < input.Length; i += 2)
{
    // Compare magnitudes so that loud negative samples count too.
    sample = Math.Abs((float)GetShortFromLittleEndianBytes(input, i));
    if (sample > biggest) biggest = sample;
}
At this point, biggest contains the largest magnitude in our audio data. Now to perform the actual normalization, we divide 32767 by biggest to get the gain that brings the loudest sample exactly to the peak value. Next we multiply each audio sample by this gain, which raises the volume of every sample by the same ratio.
// Guard against pure silence, where biggest is 0 and the division would produce infinity.
float gain = biggest > 0F ? 32767F / biggest : 1F;
float[] data = new float[input.Length / 2];
for (int i = 0; i < input.Length; i += 2)
{
    data[i / 2] = (float)GetShortFromLittleEndianBytes(input, i) * gain;
}
The last step is to convert the samples from floating-point to integer values, and store them as little-endian shorts.
byte[] output = new byte[input.Length];
for (int i = 0; i < output.Length; i += 2)
{
byte[] tmp = GetLittleEndianBytesFromShort(Convert.ToInt16(data[i / 2]));
output[i] = tmp[0];
output[i + 1] = tmp[1];
}
And we're done! Now you can send the output byte array, which contains the normalized PCM data, to your audio player.
As a final note, keep in mind that this code isn't the most efficient; you could combine several of these loops, and you could probably use Buffer.BlockCopy() for the array copying, as well as modifying your short to byte[] helper function to take a byte array as a parameter and copy the value directly into the array.
I didn't do any of this so as to make it easier to see what's going on.
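For what it's worth, here is a rough sketch of what the combined version might look like: one pass to find the peak, one pass to scale and write the bytes straight into the output array. Note that it scales by a gain factor (32767 divided by the peak magnitude), which is peak normalization in the usual sense. PeakNormalizer and Normalize16BitPcm are names I made up, and the sketch assumes little-endian 16-bit PCM with no dithering.

```csharp
using System;

public static class PeakNormalizer
{
    // Hypothetical helper: peak-normalize little-endian 16-bit PCM data.
    public static byte[] Normalize16BitPcm(byte[] input)
    {
        // Pass 1: find the largest absolute sample value across all channels.
        int peak = 0;
        for (int i = 0; i + 1 < input.Length; i += 2)
        {
            short s = (short)((input[i + 1] << 8) | input[i]);
            // Widen to int first: Math.Abs on short.MinValue itself would overflow.
            int magnitude = Math.Abs((int)s);
            if (magnitude > peak) peak = magnitude;
        }

        // Start from a copy so silence (peak == 0) and any odd trailing byte pass through unchanged.
        byte[] output = (byte[])input.Clone();
        if (peak == 0) return output;

        // Pass 2: scale every sample by the same gain and store it little-endian.
        double gain = 32767.0 / peak;
        for (int i = 0; i + 1 < input.Length; i += 2)
        {
            short s = (short)((input[i + 1] << 8) | input[i]);
            short scaled = (short)Math.Round(s * gain);
            output[i] = (byte)(scaled & 0xFF);
            output[i + 1] = (byte)((scaled >> 8) & 0xFF);
        }
        return output;
    }
}
```

You would call it as byte[] output = PeakNormalizer.Normalize16BitPcm(input); and hand the result to your player as before.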
And as I mentioned before, you should absolutely read up on dithering, as it will vastly improve the quality of your audio output.
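To give a flavour of what dithering looks like in code, here is a minimal sketch of one common approach, TPDF (triangular) dither: add a small amount of triangular-distributed noise, about one quantization step either way, before rounding to an integer. This is only an illustration under my own assumptions, not a tuned implementation; Dither and QuantizeWithTpdfDither are made-up names.

```csharp
using System;

public static class Dither
{
    // Hypothetical helper: quantize a floating-point sample to 16 bits with TPDF dither.
    public static short QuantizeWithTpdfDither(double sample, Random rng)
    {
        // The sum of two independent uniform values in [-0.5, 0.5) has a
        // triangular distribution spanning roughly -1 to +1 quantization steps.
        double noise = (rng.NextDouble() - 0.5) + (rng.NextDouble() - 0.5);
        double rounded = Math.Round(sample + noise);

        // Clamp so the added noise can never push a sample outside the 16-bit range.
        double clamped = Math.Max(short.MinValue, Math.Min(short.MaxValue, rounded));
        return (short)clamped;
    }
}
```

In the conversion loop you would use this in place of the plain rounding step, sharing a single Random instance across all samples.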
I've been working on an audio project myself, so I figured all this out through some trial-and-error; I hope it helps somebody somewhere.
float values, each float will take 4 bytes? And that in the code you've given for reading from data, biggest will always be positive, and at most 255? It feels like you're fundamentally missing how bytes and floats work... – Jon Skeet