4
votes

I am trying to use the NumPy library for Python to do some frequency analysis. I have two .wav files that both contain a 440 Hz sine wave. One of them I generated with the NumPy sine function, and the other I generated in Audacity. The FFT works on the Python-generated one, but does nothing on the Audacity one.

Here are links to the two files:

The non-working file: 440_audacity.wav

The working file: 440_gen.wav

This is the code I am using to do the Fourier transform:

import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave

infile = "440_gen.wav"
rate, data = wave.read(infile)

data = np.array(data)

data_fft = np.fft.fft(data)
frequencies = np.abs(data_fft)

plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)

plt.subplot(2,1,2)
plt.plot(frequencies)
plt.title("Fourier transform results")

plt.xlim(0, 1000)

plt.tight_layout()

plt.show()

I have two 16-bit PCM .wav files, one from Audacity and one created with the NumPy sine function. The NumPy-generated one gives the following (correct) result, with the spike at 440Hz: FFT on the numpy generated file

The one I created with Audacity, although the waveform appears identical, does not give any result on the Fourier transform: FFT on the audacity generated file

I admit I am at a loss here. The two files should contain in effect the same data. They are encoded the same way, and the wave forms appear identical on the upper graph.

Here is the code used to generate the working file:

import numpy as np
import wave
import struct
import matplotlib.pyplot as plt
from operator import add

freq_one = 440.0
num_samples = 44100
sample_rate = 44100.0
amplitude = 12800

file = "440_gen.wav"

s1 = [np.sin(2 * np.pi * freq_one * x/sample_rate) * amplitude for x in range(num_samples)]

sine_one = np.array(s1)

nframes = num_samples
comptype = "NONE"
compname="not compressed"
nchannels = 1
sampwidth = 2

wav_file = wave.open(file, 'w')
wav_file.setparams((nchannels, sampwidth, int(sample_rate), nframes, comptype, compname))

for s in sine_one:
    wav_file.writeframes(struct.pack('h', int(s)))
2
Is there any chance you could share 440_gen.wav?jwalton
try record with aplay 440_gen.wav in the terminal and see if wav file is correct.Angelo Mendes
@Ralph I'm working on uploading both files to attach.darkfire613
@darkfire613 Thanks. Just the file generated by audacity should do it since you included the code to produce the python generated .wav. Also, it's not true that the "wave forms appear identical on the upper graph". Though their frequencies are the same the amplitudes are not (although this shouldn't matter and I don't think this is the problem)jwalton
@Ralph I have added Dropbox links to both files.darkfire613

2 Answers

1
votes

The following worked for me and produced the plots as expected:

import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave

infile = "440_gen.wav"
rate, data = wave.read(infile)

data = np.array(data)

# Use first 44100 datapoints in transform
data_fft = np.fft.fft(data[:44100])
frequencies = np.abs(data_fft)

plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)

plt.subplot(2,1,2)
plt.plot(frequencies)
plt.title("Fourier transform results")

plt.xlim(0, 1000)

plt.tight_layout()

plt.show()

enter image description here

enter image description here

2
votes

Let me explain why your code doesn't work. And why it works with [:44100].

First of all, you have different files:

440_gen.wav      = 1 sec and 44100  samples (counts)        
440_audacity.wav = 5 sec and 220500 samples (counts)

Since for 440_gen.wav in FFT you use the number of reference points N=44100 and the sample rate 44100, your frequency resolution is 1 Hz (bins are followed in 1 Hz increments).
Therefore, on the graph, each FFT sample corresponds to a delta equal to 1 Hz.
plt.xlim(0, 1000) just corresponds to the range 0-1000 Hz.

However, for 440_audacity.wav in FFT, you use the number of reference points N=220500 and the sample rate 44100. Your frequency resolution is 0.2 Hz (bins follow in 0.2 Hz increments) - on the graph, each FFT sample corresponds to a frequency in 0.2 Hz increments (min-max = +(-) 22500 Hz).
plt.xlim(0, 1000) just corresponds to the range 1000x0.2 = 0-200 Hz.
That is why the result is not visible - it does not fall within this range.

plt.xlim (0, 5000) will correct your situation and extend the range to 0-1000 Hz.

The solution [:44100] that jwalton brought in really only forces the FFT to use N = 44100. And this repeats the situation with the calculation for 440_gen.wav

A more correct solution to your problem is to use the N (Windows Size) parameter in the code and the np.fft.fftfreq() function.

Sample code below.

I also recommend an excellent article https://realpython.com/python-scipy-fft/

import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wave

N = 44100 # added

infile = "440_audacity.wav"
rate, data = wave.read(infile)

data = np.array(data)

data_fft = np.fft.fft(data, N)  # added N
frequencies = np.abs(data_fft)
x_freq = np.fft.fftfreq(N, 1/44100)  # added

plt.subplot(2,1,1)
plt.plot(data[:800])
plt.title("Original wave: " + infile)

plt.subplot(2,1,2)
plt.plot(x_freq, frequencies)  # added x_freq 
plt.title("Fourier transform results")

plt.xlim(0, 1000)
plt.tight_layout()
plt.show()