2 votes

I have 2 signals: one contains audio data that is played over speakers, the second contains mic data that records those speakers simultaneously.

What I've done so far: align the signals in the time domain via correlation, apply an FFT to the overlapping part of both signals, and divide one by the other in order to achieve deconvolution (a stripped-down sketch of the alignment step follows the code below).
What am I doing wrong? The resulting audio data is useless.

Here is my code:

        //put both signals into split-complex vectors
        vDSP_ctoz((DSPComplex *)file, 2, &fftFileData, 1, nOver2);
        vDSP_ctoz((DSPComplex *)mic, 2, &fftMicData, 1, nOver2);

        //fft of both file and mic data
        vDSP_fft_zrip(fftSetup, &fftFileData, 1, log2n, FFT_FORWARD);
        vDSP_fft_zrip(fftSetup, &fftMicData, 1, log2n, FFT_FORWARD);

        //divide mic data by file data (vDSP_zvdiv takes the divisor as its first argument) for deconvolution???
        vDSP_zvdiv(&fftFileData, 1, &fftMicData, 1, &fftMicData, 1, nOver2);

        //inverse fft of the quotient
        vDSP_fft_zrip(fftSetup, &fftMicData, 1, log2n, FFT_INVERSE);

        //scale back signal
        vDSP_vsmul(fftMicData.realp, 1, &scale, fftMicData.realp, 1, nOver2);
        vDSP_vsmul(fftMicData.imagp, 1, &scale, fftMicData.imagp, 1, nOver2);

        //copy back to float array
        vDSP_ztoc(&fftMicData, 1, (COMPLEX *)result, 2, nOver2);
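
The time-domain alignment step isn't shown above; stripped down, it is just a brute-force cross-correlation search for the best lag. The snippet below is only illustrative, not my exact code:

    #include <stddef.h>

    //illustrative only: find the lag (in samples) at which mic lines up best
    //with file by brute-force cross-correlation
    static size_t best_lag(const float *file, const float *mic,
                           size_t n, size_t maxLag)
    {
        size_t bestLag = 0;
        double bestCorr = 0.0;
        for (size_t lag = 0; lag < maxLag; lag++) {
            double corr = 0.0;
            for (size_t i = 0; i + lag < n; i++)
                corr += (double)file[i] * mic[i + lag];
            if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
        }
        return bestLag;  //mic[bestLag] corresponds to file[0]
    }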

Edit for a little clarification: thanks to @Sammio2 I now know that deconvolution describes my problem very well:

f * g = h

h is my recorded signal, consisting of

f, the signal I wish to recover, and

g, my playback signal as it was picked up in the recording, which I know, but which was most likely modified by the speaker->mic roundtrip.

Now I need some way to recover f, which is all the sound recorded in addition to g.

Important: in the end I don't need a clean signal for f, just information about its loudness or level of presence; basically the noise level besides the recorded roundtrip signal g.
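
To be concrete about what I mean by "level": given some residual buffer (residual and numSamples below are just placeholder names), I would simply take its RMS, e.g. with vDSP_rmsqv:

    //level of a (hypothetical) residual buffer, measured as root-mean-square
    float level = 0.0f;
    vDSP_rmsqv(residual, 1, &level, numSamples);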

How should I proceed to gather the noise-level information I'm after?

I hope this helps to make my problem clearer. Thanks so far!

It looks like you may be somehow confusing impulse response convolution with additive noise. A noiseless system will have a non-zero impulse response. – hotpaw2

3 Answers

3 votes

The length argument to vDSP_zvsub is the number of complex elements to be processed, not the logarithm of the number of elements. You should pass nOver2 rather than log2n.

This merely addresses the programming aspect. Other answers address signal processing issues. In particular, an FFT is linear: Given signals X and Y and constants a and b, FFT(a•X+b•Y) = a•FFT(X)+b•FFT(Y). The inverse FFT is also linear. Therefore, an inverse FFT of the difference of the FFTs of two signals should not give you a different result from subtracting two signals directly, except for the usual floating-point rounding errors.
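
If a plain difference of the two aligned signals is all that is wanted, the time-domain route is a single call. A sketch using the question's buffer names, with numSamples standing in for the number of real samples:

    //result = mic - file, element-wise; vDSP_vsub subtracts its first operand
    //from its second, so the signal being removed (file) goes first
    vDSP_vsub(file, 1, mic, 1, result, 1, numSamples);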

2 votes

With vDSP_zvsub you're just doing a complex subtraction at each bin, which is probably not what you want.

It's not clear exactly what you're trying to achieve, but it sounds like you want to subtract the magnitude of one spectrum from the other, in which case you would need to do the following (a rough sketch follows the list):

  • convert each frequency-domain spectrum from complex (rectangular) to polar form (magnitude + phase)
  • subtract the magnitudes at each bin
  • convert the resulting polar data back to complex
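
A rough sketch of those three steps with vDSP/vForce, reusing the split-complex buffers and nOver2 from the question; the temporary buffer names are made up, and note that the packed DC/Nyquist bin produced by vDSP's real FFT would need separate handling:

    //polar form of both spectra
    float micMag[nOver2], fileMag[nOver2], micPhase[nOver2];
    vDSP_zvabs(&fftMicData, 1, micMag, 1, nOver2);     //magnitudes of mic spectrum
    vDSP_zvabs(&fftFileData, 1, fileMag, 1, nOver2);   //magnitudes of file spectrum
    vDSP_zvphas(&fftMicData, 1, micPhase, 1, nOver2);  //keep the mic phase

    //subtract magnitudes per bin: diffMag = micMag - fileMag
    //(vDSP_vsub subtracts its first operand from its second)
    float diffMag[nOver2];
    vDSP_vsub(fileMag, 1, micMag, 1, diffMag, 1, nOver2);

    //back to rectangular form using the retained phase:
    //real = mag*cos(phase), imag = mag*sin(phase)
    float cosBuf[nOver2], sinBuf[nOver2];
    int count = (int)nOver2;
    vvcosf(cosBuf, micPhase, &count);
    vvsinf(sinBuf, micPhase, &count);
    vDSP_vmul(diffMag, 1, cosBuf, 1, fftMicData.realp, 1, nOver2);
    vDSP_vmul(diffMag, 1, sinBuf, 1, fftMicData.imagp, 1, nOver2);

Negative magnitude differences would normally be clamped to zero before resynthesis, and the result is only as meaningful as the alignment and relative scaling of the two spectra.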

0 votes

You will need the system impulse response between the audio sent to the speaker and the audio received from the mic (DAC/ADC buffering delay, anti-aliasing filter group delay, speaker and mic responses, speed of sound in air, etc.) in order to produce a (mostly) canceling signal, either in the time domain or in the frequency domain. Note that this includes matching amplitudes as well as delays, and that one set of speakers or mics may well be "out of phase" compared to others.
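
To make the "matching amplitudes as well as delays" point concrete, here is a deliberately simplified sketch in plain C. It assumes the two buffers are already time-aligned (see the correlation sketch in the question) and models the whole speaker->mic path as a single gain, which a real system is not; estimating a full impulse response (for example adaptively) is the proper version of this.

    #include <stddef.h>

    //cancel a gain-matched copy of the reference from the mic signal;
    //ref = audio sent to the speaker, mic = recording, both length n and
    //already time-aligned; the leftover signal is written to residual
    static void cancel_reference(const float *ref, const float *mic,
                                 float *residual, size_t n)
    {
        //least-squares gain: g = <ref, mic> / <ref, ref>
        double num = 0.0, den = 0.0;
        for (size_t i = 0; i < n; i++) {
            num += (double)ref[i] * mic[i];
            den += (double)ref[i] * ref[i];
        }
        float gain = den > 0.0 ? (float)(num / den) : 0.0f;

        //subtract the scaled reference; what remains is everything the mic
        //picked up besides the playback, plus whatever the single-gain model
        //failed to cancel
        for (size_t i = 0; i < n; i++)
            residual[i] = mic[i] - gain * ref[i];
    }

The RMS of that residual is then the noise-level figure the question asks for.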