Frequency Shifter Using FFT

Question

I've been working on a frequency shifter using a primitive FFT algorithm supplied by Rosetta Code. I understand that to frequency shift a signal of samples, one applies an FFT to the original audio, multiplies the frequency of each resulting sine-wave by the frequency-shift factor (user defined), and then adds the sine-waves back together. When I run my algorithm, the output is of extremely low quality, as though there were not enough sine waves collected within the algorithm to have reproduced the signal close to correctly in the first place. The algorithm is implemented in a class in a header file and called (correctly) elsewhere.

#include <complex>
#include <valarray>

typedef std::complex<double> Complex;
typedef std::valarray<Complex> CArray;

class FrequencyShifter {
float sampleRate;
public:
    FrequencyShifter() {

    }
    void setSampleRate(float inSampleRate) {
        sampleRate = inSampleRate;
    }
    double abs(double in0) {
        if (in0>=0) return in0;
        else return -in0;
    }
    void fft(CArray& x)
    {
        const size_t N = x.size();
        if (N <= 1) return;

        // divide
        CArray even = x[std::slice(0, N/2, 2)];
        CArray  odd = x[std::slice(1, N/2, 2)];

        // conquer
        fft(even);
        fft(odd);

        // combine
        for (size_t k = 0; k < N/2; ++k)
        {
            Complex t = std::polar(1.0, -2 * PI * k / N) * odd[k];
            x[k    ] = even[k] + t;
            x[k+N/2] = even[k] - t;
        }
    }
    double convertToReal(double im, double re) {
        return sqrt(abs(im*im - re*re));
    }
    void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
        //inFramesToProcess is the amount of samples in inBlock
        Complex *copy = new Complex[inFramesToProcess];
        for (int frame = 0; frame<inFramesToProcess; frame++) {
            copy[frame] = Complex((double)inBlock[frame], 0.0);
        }
        CArray data(copy, inFramesToProcess);
        fft(data);
        const float freqoffsets = sampleRate/inFramesToProcess;
        for (float x = 0; x<data.size()/2; x++) {
            for (float frame = 0; frame<inFramesToProcess; frame++) {
                inBlock[(int)frame] = (float)(convertToReal(data[(int)x].imag(), data[(int)x].real())*sin(freqoffsets*x*frame*scale));
            }
        }
    }
};

I'm assuming that part of the problem is that I'm only including sampleRate/inFramesToProcess frequencies for the sine waves to cover. Would sending larger audio files (thus larger *inBlocks and inFramesToProcesss) make the audio less grainy? How would I accomplish this without just changing the values or lengths of the arguments?

@1201ProgramAlarm When I test the output of *inBlock there is no level (audio level is 0 or some other error was encountered). Essentially, there is some mistake in the algorithm which I am unable to detect and fix. — Linus Rastegar
Is convertToReal the correct way round? Trivially, if inFramesToProcess is 1, data will have a complex number with no imaginary part in it. fft won't do anything to it, so when this gets converted back you'll try to take the sqrt of a negative number. Nontrivially, fft won't do anything to the last element of x if x.size() is odd. — 1201ProgramAlarm
@1201ProgramAlarm Ah, thanks for this heads up! I had not realized either of these two things. I'll try to fix them as soon as I can and update the post. This may be the fix I'm looking for. — Linus Rastegar
@1201ProgramAlarm I've reached a dead end trying to fix these two problems. I'd appreciate it if you could flesh out your reply into an answer. — Linus Rastegar

SleuthEye SleuthEye · Accepted Answer · 2017-02-01T04:21:20

Here is an updated version of processBlock with a number of tweaks required to implement the frequency shift, which I will describe below:

void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
    //inFramesToProcess is the amount of samples in inBlock
    Complex *copy = new Complex[inFramesToProcess];
    for (int frame = 0; frame<inFramesToProcess; frame++) {
        copy[frame] = Complex((double)inBlock[frame], 0.0);
    }
    CArray data(copy, inFramesToProcess);
    fft(data);
    const float freqoffsets = 2.0*PI/inFramesToProcess;
    const float normfactor  = 2.0/inFramesToProcess;
    for (int frame = 0; frame<inFramesToProcess; frame++) {
        inBlock[frame] = 0.5*data[0].real();
        for (int x = 1; x<data.size()/2; x++) {
            float arg = freqoffsets*x*frame*scale;
            inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
        }
        inBlock[frame] *= normfactor;
    }
}

Derivation

The spectrum you get from the FFT is complex-valued, which could be seen as providing a representation of your signals in terms of sine and cosine waves. Reconstructing the time-domain waveform can be done using the inverse transform, which would be given by the relation:

Taking advantage of the frequency spectrum symmetry, this can be expressed as:

or equivalently:

As you might have noticed the term at index 0 and N/2 are special cases with purely real coefficients in the frequency domain. For simplicity, assuming the spectrum does not go all the way to N/2, you could drop that N/2 term and still get a reasonable approximation. For the other terms you would get a contribution which can be implemented as

normfactor = 2.0/inFramesToProcess;
normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg))

You would of course need to add all these contributions into the final buffer inBlock[frame], and not simply overwriting previous results:

inBlock[frame] += normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg));
//             ^^

Note that the normalization can be done on the final result after the loop to reduce the number of multiplications. In doing so, we must pay special attention to the DC term at index 0 (which has a coefficient of 1/N instead of 2/N):

inBlock[frame] = 0.5*data[0].real();
for (int x = 1; x<data.size()/2; x++) {
    float arg = freqoffsets*x*frame*scale;
    inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
}
inBlock[frame] *= normfactor;

Finally, when generating the tones, the phase argument arg to sin and cos should be of the form 2*pi*k*n/inFramesToProcess (prior to the application of the scale factor), where n is the time-domain sample index and k is the frequency domain index. The end result is that the computed frequency increment freqoffsets should really be 2.0*PI/inFramesToProcess.

Notes

The FFT algorithm works on the assumption that your underlying time-domain signal is periodic with the period of your block length. As a result there may be audible discontinuities between the blocks.
Future readers should be made aware that this does not shift the spectrum by a constant amount but rather squish or expand the spectrum as the frequencies are scaled by a multiplicative factor. For example a signal which includes 100-200Hz components might get squished to 75-150Hz by a factor 0.75. Notice how the lower limit was down shifted by 25Hz while the upper limit was down shifter by 50Hz.

Frequency Shifter Using FFT

1 Answers