I am making a simple sound equalizer that operates in the frequency domain and lets the user adjust frequencies in the sound using 4 sliders. The first one is responsible for 0-5 kHz, the fourth one for 15-20 kHz.
Steps are as follows:
- I read a WAV file and store it in a float array
- I perform a complex FFT on that array (separately for the left and right channels)
- I multiply the real and imaginary parts of the bins representing 0-5 kHz frequencies (both positive and negative) by 3.981 to increase these low frequencies by 12 dB in the final sound
- I perform an IFFT on the array
- I alternate the real parts of the left and right channels (returned by the IFFT) to create the final audio
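The multiplication factor in the third step follows from the usual amplitude conversion, ratio = 10^(dB / 20), so 12 dB corresponds to roughly 3.981. A minimal sketch of that conversion; the class and method names are hypothetical and not part of the code in this question:

```java
// Sketch: convert a gain in decibels to the linear amplitude factor
// that is applied to the real and imaginary parts of each FFT bin.
// DecibelGain, toLinear and toDecibels are hypothetical names.
final class DecibelGain {

    // Amplitude ratio for a gain given in dB: ratio = 10^(dB / 20).
    static double toLinear(double decibels) {
        return Math.pow(10.0, decibels / 20.0);
    }

    // Inverse conversion: gain in dB for a given amplitude ratio.
    static double toDecibels(double ratio) {
        return 20.0 * Math.log10(ratio);
    }

    public static void main(String[] args) {
        // Boost and cut examples; a factor above 1 amplifies, below 1 attenuates.
        System.out.println("+12 dB -> x" + toLinear(12.0));
        System.out.println(" -6 dB -> x" + toLinear(-6.0));
    }
}
```

A factor greater than 1 raises the level of the affected bins, a factor below 1 lowers it.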
The problem is that after this process the sound is distorted. It sounds as if the speakers were not plugged in correctly. I found that if I divide the values returned by the IFFT by an arbitrary constant, the final sound is right but much quieter. I do the division in the time domain, on the results from the IFFT.
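The time-domain division described above could be sketched as follows. This is only an illustration: using the peak absolute value as the divisor is an assumption made here, not something established in this question, and it also assumes the decoded float samples are meant to stay within [-1, 1]. The names PeakScaling, peak and scaleToUnitPeak are hypothetical.

```java
// Sketch: divide all samples by one constant in the time domain.
// Assumption: samples are floats intended to lie in [-1, 1], and the
// constant is chosen as the largest absolute sample value.
final class PeakScaling {

    // Largest absolute sample value in the buffer (0 for an empty buffer).
    static float peak(float[] samples) {
        float max = 0f;
        for (float s : samples) {
            max = Math.max(max, Math.abs(s));
        }
        return max;
    }

    // Divides every sample by the peak, but only when the peak exceeds 1.
    // Returns the divisor that was actually used (1 if nothing was changed).
    static float scaleToUnitPeak(float[] samples) {
        float p = peak(samples);
        if (p <= 1.0f) {
            return 1.0f;
        }
        for (int i = 0; i < samples.length; i++) {
            samples[i] /= p;
        }
        return p;
    }

    public static void main(String[] args) {
        float[] afterIfft = {0.5f, -2.0f, 1.0f}; // made-up values exceeding [-1, 1]
        float divisor = scaleToUnitPeak(afterIfft);
        System.out.println("divided by " + divisor);
    }
}
```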
The problem doesn't occur if I multiply the frequencies by a number less than 1. So if frequencies are attenuated, no further division in the time domain is needed.
I suppose there is a mistake somewhere in the process. But if all the steps are fine, how should I deal with the distorted sound? Is dividing in the time domain a proper solution? What number should I divide the results by so that the sound is not distorted?
EDIT
This is the code I use to perform the steps presented above. I use the Apache Commons Math implementation of the FFT and the SimpleAudioConversion class taken from http://stackoverflow.com/a/26824664/2891664
// read file and store playable content in byte array
File file = new File("/home/kamil/Downloads/Glory.wav");
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat fmt = in.getFormat();
byte[] bytes = new byte[in.available()];
int result = in.read(bytes);
// convert bytes to float array
float[] samples = new float[bytes.length * 8 / fmt.getSampleSizeInBits()];
int validSamples = SimpleAudioConversion.decode(bytes, samples, result, fmt);
// find nearest power of 2 to zero-pad array in order to use fft
int power = 0;
while (Math.pow(2, power) < samples.length / 2)
    power++;
// divide data into left and right channels
double[][] left = new double[2][(int) Math.pow(2, power)];
double[][] right = new double[2][(int) Math.pow(2, power)];
for (int i = 0; i < samples.length / 2; i++) {
    left[0][i] = samples[2 * i];
    right[0][i] = samples[2 * i + 1];
}
//fft
FastFourierTransformer.transformInPlace(left, DftNormalization.STANDARD, TransformType.FORWARD);
FastFourierTransformer.transformInPlace(right, DftNormalization.STANDARD, TransformType.FORWARD);
// here I amplify the 0-4kHz frequencies by 12dB
// 0-4kHz is 1/5 of whole spectrum, and since there are negative frequencies in the array
// I iterate over 1/10 and multiply frequencies on both sides of the array
for (int i = 1; i < left[0].length / 10; i++) {
    double factor = 3.981d; // ratio = 10^(12dB/20)
    // positive frequencies 0-4kHz
    left[0][i] *= factor;
    right[0][i] *= factor;
    left[1][i] *= factor;
    right[1][i] *= factor;
    // negative frequencies 0-4kHz
    left[0][left[0].length - i] *= factor;
    right[0][left[0].length - i] *= factor;
    left[1][left[0].length - i] *= factor;
    right[1][left[0].length - i] *= factor;
}
//ifft
FastFourierTransformer.transformInPlace(left, DftNormalization.STANDARD, TransformType.INVERSE);
FastFourierTransformer.transformInPlace(right, DftNormalization.STANDARD, TransformType.INVERSE);
// put left and right channel into array
float[] samples2 = new float[(left[0].length) * 2];
for (int i = 0; i < samples2.length / 2; i++) {
    samples2[2 * i] = (float) left[0][i];
    samples2[2 * i + 1] = (float) right[0][i];
}
// convert back to byte array which can be played
byte[] bytes2 = new byte[bytes.length];
int validBytes = SimpleAudioConversion.encode(samples2, bytes2, validSamples, fmt);
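The bin range used in the amplification loop above depends on the sample rate: in an N-point FFT, bin k corresponds to the frequency k * sampleRate / N, so the first tenth of the array covers 0 to sampleRate / 10. The sketch below shows that mapping. The class and method names are hypothetical, and the 44100 Hz rate in main is only an example value; in the code above the rate would come from fmt.getSampleRate().

```java
// Sketch: map between FFT bin indices and frequencies in Hz.
// FftBins, binFrequency and binForFrequency are hypothetical names.
final class FftBins {

    // Frequency in Hz represented by bin k of an fftSize-point FFT.
    static double binFrequency(int k, int fftSize, double sampleRate) {
        return k * sampleRate / fftSize;
    }

    // Smallest bin index whose frequency is at or above the given frequency.
    // Usable as the exclusive upper bound of a loop over bins below that frequency.
    static int binForFrequency(double hz, int fftSize, double sampleRate) {
        return (int) Math.ceil(hz * fftSize / sampleRate);
    }

    public static void main(String[] args) {
        int fftSize = 1 << 20;       // example FFT length (power of 2)
        double sampleRate = 44100.0; // example value, not taken from the file

        // Upper frequency reached by iterating over one tenth of the array.
        System.out.println(binFrequency(fftSize / 10, fftSize, sampleRate) + " Hz");

        // Loop bound that would correspond to a 4 kHz cutoff instead.
        System.out.println(binForFrequency(4000.0, fftSize, sampleRate) + " bins");
    }
}
```

The mirrored negative-frequency bin for index k is fftSize - k, as in the loop above.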
You may listen to the sound here https://vocaroo.com/i/s095uOJZiewf
Does new double[2][(int) Math.pow(2, power)]; zero-initialize the array in Java? It wouldn’t in C++, hence my asking. – Cris Luengo