I am making a simple sound equalizer that operates in the frequency domain and lets the user adjust the frequencies in a sound using 4 sliders. The first slider is responsible for 0-5kHz, the fourth one for 15-20kHz.

Steps are as follows:

  1. I read wav file and store it in float array
  2. I perform complex fft on that array (separately for left and right channel)
  3. I multiply real and imaginary parts of bins representing 0-5kHz frequencies (both positive and negative) by 3.981 to increase these low frequencies by 12dB in the final sound.
  4. I perform ifft on array
  5. I alternate real parts of left and right channels (returned by ifft) to create the final audio
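
The factor in step 3 follows from the decibel definition for amplitude, ratio = 10^(dB/20). A minimal sketch of that conversion and of scaling one bin together with its negative-frequency mirror, assuming the double[2][n] layout (row 0 real, row 1 imaginary) used in the code further down; the class and method names are illustrative and not part of any library:

```java
final class DecibelGain {
    /** Converts an amplitude gain in decibels to a linear multiplier: 10^(dB/20). */
    static double dbToLinear(double db) {
        return Math.pow(10.0, db / 20.0);
    }

    /**
     * Scales bin i and its mirror bin n - i by the same real factor.
     * spectrum[0] holds real parts, spectrum[1] holds imaginary parts.
     */
    static void scaleBinPair(double[][] spectrum, int i, double factor) {
        int n = spectrum[0].length;
        spectrum[0][i] *= factor;
        spectrum[1][i] *= factor;

        int mirror = n - i;
        // DC (i = 0) has no mirror inside the array, and the Nyquist bin
        // (i = n / 2) is its own mirror, so neither is scaled a second time.
        if (mirror < n && mirror != i) {
            spectrum[0][mirror] *= factor;
            spectrum[1][mirror] *= factor;
        }
    }

    public static void main(String[] args) {
        System.out.println("12 dB as a linear factor: " + dbToLinear(12.0));
    }
}
```

Scaling both halves of the spectrum by the same real factor keeps the conjugate symmetry of a real signal's spectrum, so the result of the ifft stays real apart from rounding error.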

The problem is that after this process the sound is distorted. It sounds as if the speakers were not plugged in correctly. I found that if I divide the values returned by ifft by an arbitrary constant, the final sound is right but much quieter. I do the division in the time domain, on the results from ifft.

The problem doesn't occur if I multiply frequencies by a number less than 1. So if frequencies are attenuated no further division in time domain is needed.

I suppose there is a mistake in the whole process. But if all steps are fine, how should I deal with distorted sound? Is dividing in time domain a proper solution? What number should I use to divide the results then so the sound is not distorted?

EDIT

This is the code I use to perform the steps above. I use the Apache Commons Math implementation of FFT and the SimpleAudioConversion class taken from http://stackoverflow.com/a/26824664/2891664

// read file and store playable content in byte array
File file = new File("/home/kamil/Downloads/Glory.wav");
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat fmt = in.getFormat();
byte[] bytes = new byte[in.available()];
int result = in.read(bytes);

// convert bytes to float array
float[] samples = new float[bytes.length * 8 / fmt.getSampleSizeInBits()];
int validSamples = SimpleAudioConversion.decode(bytes, samples, result, fmt);

// find nearest power of 2 to zero-pad array in order to use fft
int power = 0;
while (Math.pow(2, power) < samples.length / 2)
    power++;

// divide data into left and right channels
double[][] left = new double[2][(int) Math.pow(2, power)];
double[][] right = new double[2][(int) Math.pow(2, power)];

for (int i = 0; i < samples.length / 2; i++) {
    left[0][i] = samples[2 * i];
    right[0][i] = samples[2 * i + 1];
}

//fft
FastFourierTransformer.transformInPlace(left, DftNormalization.STANDARD, TransformType.FORWARD);
FastFourierTransformer.transformInPlace(right, DftNormalization.STANDARD, TransformType.FORWARD);

// here I amplify the 0-4kHz frequencies by 12dB
// 0-4kHz is 1/5 of whole spectrum, and since there are negative frequencies in the array
// I iterate over 1/10 and multiply frequencies on both sides of the array
for (int i = 1; i < left[0].length / 10; i++) {
    double factor = 3.981d; // ratio = 10^(12dB/20)
    //positive frequencies 0-4kHz
    left[0][i] *= factor;
    right[0][i] *= factor;
    left[1][i] *= factor;
    right[1][i] *= factor;

    // negative frequencies 0-4kHz
    left[0][left[0].length - i] *= factor;
    right[0][left[0].length - i] *= factor;
    left[1][left[0].length - i] *= factor;
    right[1][left[0].length - i] *= factor;
}

//ifft
FastFourierTransformer.transformInPlace(left, DftNormalization.STANDARD, TransformType.INVERSE);
FastFourierTransformer.transformInPlace(right, DftNormalization.STANDARD, TransformType.INVERSE);

// put left and right channel into array
float[] samples2 = new float[(left[0].length) * 2];
for (int i = 0; i < samples2.length / 2; i++) {
    samples2[2 * i] = (float) left[0][i];
    samples2[2 * i + 1] = (float) right[0][i];
}

// convert back to byte array which can be played
byte[] bytes2 = new byte[bytes.length];
int validBytes = SimpleAudioConversion.encode(samples2, bytes2, validSamples, fmt);

You may listen to the sound here https://vocaroo.com/i/s095uOJZiewf

Depending on what FFT library you are using you may need to divide by N, where N is the size of the FFT. - Paul R
I use the Apache Commons Math library, which performs normalization. - mrJoe
You should probably mention that in your question then - hit the edit link above to include this information. - Paul R
I updated the question and posted my code. Could you look at this? - mrJoe
Does new double[2][(int) Math.pow(2, power)]; zero-initialize the array in Java? It wouldn't in C++, hence my asking. - Cris Luengo

1 Answer

If you amplify in either domain, you can potentially end up clipping the signal (which can sound horrible).

So you might need to check your ifft results to see whether any sample values exceed the range your audio system allows (usually -32768 to 32767 for 16-bit integer samples, or -1.0 to 1.0 for floating-point samples). To avoid any clipping you find, either reduce the gain applied to the fft bins, or reduce the amplitude of the original input signal or of the whole ifft result.
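
A minimal sketch of that check, assuming floating-point samples where full scale is 1.0. The class and method names are hypothetical, not from any library. It finds the largest absolute sample value and, only if that exceeds the ceiling, scales the whole array down so the peak lands exactly on it:

```java
final class ClipGuard {
    /** Returns the largest absolute sample value in the array. */
    static double peak(double[] samples) {
        double max = 0.0;
        for (double s : samples) {
            max = Math.max(max, Math.abs(s));
        }
        return max;
    }

    /**
     * Scales samples in place so that no value exceeds the ceiling.
     * Returns the scale factor applied (1.0 if nothing had to change).
     */
    static double limitPeak(double[] samples, double ceiling) {
        double p = peak(samples);
        if (p <= ceiling) {
            return 1.0; // already within range, leave the signal untouched
        }
        double scale = ceiling / p;
        for (int i = 0; i < samples.length; i++) {
            samples[i] *= scale;
        }
        return scale;
    }

    public static void main(String[] args) {
        double[] boosted = {0.5, -2.0, 1.0};
        double scale = limitPeak(boosted, 1.0);
        System.out.println("scale applied: " + scale);
    }
}
```

For a stereo signal it is usually better to take the larger of the two channel peaks and apply one common factor to both channels, otherwise the left/right balance shifts. This is a static, whole-file scaling, not the dynamic gain control mentioned below.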

The search term for a dynamic gain control process is AGC (Automatic Gain Control), which is non-trivial to do well.

For example, if the volume for a particular frequency bin is already at "10", your computer's knob doesn't have an "11".