I have two .wav audio files recorded simultaneously (outdoor microphones for a bioacoustics pilot study). A bird flying overhead chirps, and both mics detect it, but at different time points.
A common task is to cross-correlate the two signals and find the peak of the cross-correlation, which indicates the time lag between the signal arriving at one microphone versus the other. I found code for doing exactly this here: Find time shift of two signals using cross correlation
However, that post seems to assume that people know how to get their audio files into a useful format for this analysis. Basic attempts to just use my whole .wav files as y1 and y2 fail because the data are not in a correct format:
TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
(The dtype('<U32') suggests NumPy is seeing 32-character unicode strings, i.e. the file paths themselves, rather than arrays of audio samples.)
I started looking around at how to turn a .wav file into a numpy array, but I got errors and didn't really know what I was doing. I assume it has something to do with doing an FFT and turning each audio file into an image (a spectrogram), and that those image arrays are the y1 and y2 in the example above. I assume this link is talking about that: FFT-based 2D convolution and correlation in Python
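For reference, FFT-based cross-correlation can operate directly on the 1-D sample arrays; no spectrogram image is required. A minimal sketch, assuming y1 and y2 are already numeric numpy arrays of audio samples:

    import numpy as np
    from scipy import signal

    # Cross-correlate the raw 1-D sample arrays via FFT (method='fft');
    # the index of the peak gives the relative offset in samples
    corr = signal.correlate(y1, y2, mode='full', method='fft')
    lag_samples = np.argmax(corr) - (len(y2) - 1)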
What is the right way to proceed? Thank you very much.
TL;DR: How should I import and modify two locally saved .wav files to prepare them for finding the peak time lag by cross-correlation?
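For context, here is a minimal end-to-end sketch of what I am after, assuming both recordings are mono or stereo PCM files at the same sample rate; mic1.wav and mic2.wav are placeholder filenames:

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import correlate

    # wavfile.read returns (sample_rate, samples) with samples as a numpy array
    rate1, y1 = wavfile.read('mic1.wav')  # placeholder filename
    rate2, y2 = wavfile.read('mic2.wav')  # placeholder filename
    assert rate1 == rate2, 'recordings must share a sample rate'

    # Convert integer PCM samples to float and keep one channel if stereo
    y1 = y1.astype(np.float64)
    y2 = y2.astype(np.float64)
    if y1.ndim > 1:
        y1 = y1[:, 0]
    if y2.ndim > 1:
        y2 = y2[:, 0]

    # The index of the cross-correlation peak gives the offset in samples
    corr = correlate(y1, y2, mode='full')
    lag_samples = np.argmax(corr) - (len(y2) - 1)
    lag_seconds = lag_samples / rate1
    print(f'Peak lag: {lag_samples} samples ({lag_seconds:.4f} s)')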
You need to get your audio into numpy, there's no workaround to that. Reading in audio in wave format is explained / demonstrated in thousands of places on the internet. You might want to check e.g. scipy or librosa. I am doing very similar stuff with RSPB, here's how I read in files in this example: github.com/tracek/audio-explorer/blob/master/audiocli.py#L102 Check the docs for the load function. – Lukasz Tracewski
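The load function mentioned above is presumably librosa.load, which returns a float numpy array plus the sample rate in a single call; a minimal sketch with placeholder filenames:

    import librosa

    # sr=None keeps each file's native sample rate instead of resampling to 22050 Hz;
    # librosa.load returns float32 mono samples by default
    y1, sr1 = librosa.load('mic1.wav', sr=None)  # placeholder filename
    y2, sr2 = librosa.load('mic2.wav', sr=None)  # placeholder filename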