I was able to visualize a spectrogram using the musicg library from https://code.google.com/p/musicg/, but I ran into a few things I don't really understand. I used a WAV file with a sample rate of 22050 Hz and performed the FFT on 1024 samples with 50% overlap and a Blackman window. The result of the calculation is a two-dimensional array (spectrogram[time][frequency]=intensity). If the second dimension is frequency, why is its size only 256? Is that related to the width of a frequency bin? And how do I determine the frequency that each index corresponds to? When I used 512 samples instead, the size halved to 128.

Also, should the spectrogram be normalized?

[Image: spectrogram result]

Here is the code I got from musicg:

    short[] amplitudes=wave.getSampleAmplitudes();

    int numSamples = amplitudes.length;
    int pointer=0;
    // overlapping
    if (overlapFactor>1){

        int numOverlappedSamples=numSamples*overlapFactor;
        int backSamples=fftSampleSize*(overlapFactor-1)/overlapFactor;
        int fftSampleSize_1=fftSampleSize-1;
        short[] overlapAmp= new short[numOverlappedSamples];
        pointer=0;
        for (int i=0; i<amplitudes.length; i++){
            overlapAmp[pointer++]=amplitudes[i];
            if (pointer%fftSampleSize==fftSampleSize_1){
                // overlap
                i-=backSamples;
            }
        }
        numSamples=numOverlappedSamples;
        amplitudes=overlapAmp;
    }
    // end overlapping

    numFrames=numSamples/fftSampleSize;
    framesPerSecond=(int)(numFrames/wave.length()); 

    // set signals for fft (windowing)
    WindowFunction window = new WindowFunction();
    window.setWindowType("BLACKMAN");
    double[] win=window.generate(fftSampleSize);

    double[][] signals=new double[numFrames][];
    for(int f=0; f<numFrames; f++) {
        signals[f]=new double[fftSampleSize];

        int startSample=f*fftSampleSize;
        for (int n=0; n<fftSampleSize; n++){

            signals[f][n]=amplitudes[startSample+n]*win[n];                         
        }
    }
    // end set signals for fft

    absoluteSpectrogram=new double[numFrames][];
    // for each frame in signals, do fft on it
    FastFourierTransform fft = new FastFourierTransform();
    for (int i=0; i<numFrames; i++){            
        absoluteSpectrogram[i]=fft.getMagnitudes(signals[i]);
    }

    if (absoluteSpectrogram.length>0){

        numFrequencyUnit=absoluteSpectrogram[0].length;
        unitFrequency=(double)wave.getWaveHeader().getSampleRate()/2/numFrequencyUnit;  // frequency could be caught within the half of nSamples according to Nyquist theory

        // normalization of absoluteSpectrogram
        spectrogram=new double[numFrames][numFrequencyUnit];

        // set max and min amplitudes
        double maxAmp=Double.MIN_VALUE;
        double minAmp=Double.MAX_VALUE; 
        for (int i=0; i<numFrames; i++){
            for (int j=0; j<numFrequencyUnit; j++){
                if (absoluteSpectrogram[i][j]>maxAmp){
                    maxAmp=absoluteSpectrogram[i][j];
                }
                else if(absoluteSpectrogram[i][j]<minAmp){
                    minAmp=absoluteSpectrogram[i][j];
                }
            }
        }

thank you

The second dimension holds the frequency coefficients. Roughly speaking, the FFT tells you the power at certain specific frequencies, and you can only "guess" the power in between them. For better resolution in the frequency dimension, use a bigger FFT window (more samples). - Display Name
Then how do we determine the lowest and highest frequencies that exist in the spectrogram? - wendy0402

1 Answer

The spacing between adjacent FFT result bins is the sample rate divided by the FFT length. For data sampled at 22050 samples per second and fed to an FFT of length 1024, the resulting frequency bin spacing is about 21.5 Hz. If you reduce the FFT length to 512, the bin spacing doubles to about 43 Hz, so fewer bins fit on your spectrogram's vertical axis before it reaches its upper limit of half the sample rate (11025 Hz here). With a Blackman window (in fact, with any window), the bandwidths of neighboring bins overlap to some extent.
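
To make the arithmetic concrete, here is a minimal sketch of the spacing rule described above. The class and method names are made up for illustration and are not part of musicg. It also assumes the generic convention that bin k sits at k times the bin spacing, with bin 0 at 0 Hz and the upper limit at half the sample rate; how many bins a particular library actually returns depends on its FFT implementation, so check that against the array length you get back.

```java
// Illustrative helper; the class and method names are hypothetical, not musicg API.
final class BinFrequencies {

    // Spacing between adjacent FFT bins in Hz: sample rate / FFT length.
    static double binSpacing(double sampleRate, int fftLength) {
        return sampleRate / fftLength;
    }

    // Frequency in Hz of bin k, assuming bin 0 is 0 Hz (DC).
    static double binFrequency(int k, double sampleRate, int fftLength) {
        return k * binSpacing(sampleRate, fftLength);
    }

    // Upper frequency limit for a real-valued signal: half the sample rate.
    static double nyquist(double sampleRate) {
        return sampleRate / 2.0;
    }

    public static void main(String[] args) {
        double sampleRate = 22050.0;
        // Compare the two FFT lengths mentioned in the question.
        for (int fftLength : new int[] {1024, 512}) {
            System.out.println("FFT length " + fftLength
                    + ": spacing = " + binSpacing(sampleRate, fftLength) + " Hz"
                    + ", bin 10 = " + binFrequency(10, sampleRate, fftLength) + " Hz"
                    + ", upper limit = " + nyquist(sampleRate) + " Hz");
        }
    }
}
```

Under these assumptions the lowest frequency on the axis is 0 Hz and the highest is just under 11025 Hz, whatever the FFT length; only the step between rows changes.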