Using the mapreduce programming technique in MATLAB

Question

I am studying rat ultrasonic vocalisations (their speech in ultrasound). I have several WAV audio files of the rats' calls. Ideally, I would import a whole file into MATLAB and just process it, but I run into memory issues even with the smallest 70 MB file. This is what I want help with.

[y, Fs] = audioread('T0000201.wav');

[S, F, T] = spectrogram(y, 100, [], 256, Fs, 'yaxis'); % ... rest of program

I could break the audio in one file into blocks and process each block before moving on to the next, but I'm not sure what to do when a rat call is cut off halfway through at the end of a block (this might have a negative impact on the STFT spectrogram).

I came across another technique called "MapReduce", which seems to let me use the entirety of my data without actually reading it all in. While this seems ideal, I don't quite understand how it works or how it can be implemented. "Hadoop" has also been mentioned. Can anyone provide any assistance?

I am currently using this (http://uk.mathworks.com/help/matlab/import_export/find-maximum-value-with-mapreduce.html) for reference. My first step was trying to use the WAV file as the datastore (like the CSV file in the example), but that didn't work.

Welcome to Stack Overflow, and please pay more attention when picking tags next time you post a question. When you type a tag, it explains what it's for, and you clearly didn't read those descriptions (you picked "Processing" and "Signals", which are for a programming language called Processing, and for code that deals with hardware interrupts, respectively) - Mike 'Pomax' Kamermans

Josh Meyer · Accepted Answer · 2015-07-02T04:13:08

Since you're working primarily with a repository of audio (.wav) files, mapreduce might not be your best option. The datastore function only works with text files or key-value files.

Use the memory function to explore what the limits of memory are for MATLAB, and try processing the audio files in smaller blocks as you mentioned. Using a combination of audioread(), audioinfo(), and audiowrite(), you can break your collection of audio files up into a larger collection of smaller files that can then be individually processed.
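As a rough sketch of the block-wise approach, audioinfo() can report the file length without loading any samples, and audioread() accepts a [start finish] sample range so that only one block is in memory at a time. The block length, the temporary file name usv_demo.wav, the synthetic 30 kHz tone, and the peak-frequency step are all assumptions made so the example runs on its own; the window and FFT lengths are taken from the question.

```matlab
% Stand-in recording so the sketch runs on its own (assumed values):
% 3 s of a 30 kHz tone sampled at 96 kHz. Point 'filename' at a real file instead.
Fs = 96000;
t = (0:3*Fs-1)' / Fs;
filename = fullfile(tempdir, 'usv_demo.wav');   % hypothetical file name
audiowrite(filename, 0.5 * sin(2*pi*30000*t), Fs);

blockSecs = 1;      % assumed block length in seconds; tune to available memory
winLen = 100;       % window length from the question's spectrogram call
nfft = 256;         % FFT length from the question's spectrogram call

info = audioinfo(filename);            % reads the header only, not the samples
Fs = info.SampleRate;
blockLen = round(blockSecs * Fs);      % block length in samples
peakHz = [];                           % strongest frequency found in each block

for iStart = 1:blockLen:info.TotalSamples
    iStop = min(iStart + blockLen - 1, info.TotalSamples);
    y = audioread(filename, [iStart iStop]);   % load only this sample range
    y = y(:, 1);                               % spectrogram needs a vector: keep channel 1
    if numel(y) < winLen
        continue                               % trailing fragment shorter than one window
    end

    [S, F, T] = spectrogram(y, winLen, [], nfft, Fs);
    T = T + (iStart - 1) / Fs;                 % times relative to the start of the file

    % Placeholder analysis: frequency bin with the largest mean magnitude.
    [~, idx] = max(mean(abs(S), 2));
    peakHz(end+1) = F(idx);
end
```

Reading ranges directly like this avoids writing a second set of smaller files to disk. If separate files are still wanted, each block y could be passed to audiowrite() inside the loop.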

If you have a small number of files to work with, then you can manually inspect the smaller blocks to make sure no important rat calls are cut off between blocks. Of course, if you have thousands of files to work with, then that approach won't be feasible.
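One possible way to avoid manual inspection, which is an addition to the approach described here rather than part of it, is to make consecutive blocks overlap by at least the longest call you expect. A call that is cut at the end of one block then appears whole in the next. The sketch below only computes the sample ranges; the sample rate, file length, block length, and maximum call duration are all assumed values.

```matlab
Fs          = 250000;    % assumed sample rate in Hz
nTotal      = 10 * Fs;   % assumed file length in samples (audioinfo reports this as TotalSamples)
blockSecs   = 2;         % assumed block length in seconds
maxCallSecs = 0.5;       % assumed upper bound on the duration of a single call

blockLen = round(blockSecs * Fs);
overlap  = round(maxCallSecs * Fs);
hop      = blockLen - overlap;           % distance between consecutive block starts

firsts = (1:hop:nTotal)';
lasts  = min(firsts + blockLen - 1, nTotal);

% Drop trailing blocks that lie entirely inside the previous block.
keep   = [true; lasts(2:end) > lasts(1:end-1)];
ranges = [firsts(keep) lasts(keep)];     % one [start finish] row per block

% Each row can be passed straight to audioread, for example:
%   y = audioread(filename, ranges(k, :));
```

Calls that fall inside an overlap region will be detected in two blocks, so detections would need to be de-duplicated afterwards, for example by comparing their start times relative to the beginning of the file.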
