Sound classification using MFCC and dynamic time warping (DTW)

Question

My objective is to classify non-speech signals, for which I am using MFCC and DTW in Java. However, I am stuck in the middle and would appreciate any help. I have computed 13 MFCC values for each frame, but some of the values are negative, and I am not sure whether the process I am following is right or wrong. Currently I am using the code provided by JAudio. I have also tried other code, and it gives me negative values as well.

Secondly, I get 13 coefficients for each frame. With 157 frames for a sample of a certain length, I get 157 sets of 13 MFCCs. I am having a hard time working out how to use all of the coefficients in DTW, because DTW only gives the closest distance between two time signals. I do have DTW code that compares two time signals, but I am not sure how to use all the MFCC values of a signal as features.

Is there some crucial step of classification I am missing? Please help me.

Ioanna Ioanna · Accepted Answer · 2013-03-27T11:58:26

Say you have N1 sets of 13 MFCCs for the first signal and N2 sets of 13 MFCCs for the second. You should compute the distance between each set from the first signal and each set from the second (you can use the Euclidean distance as the distance between two 13-element arrays).
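A minimal sketch of that pairwise distance step, assuming each signal is stored as a `double[][]` with one row per frame and one column per MFCC. The class and method names (`MfccDistance`, `euclidean`, `costMatrix`) are hypothetical and are not part of JAudio:

```java
// Hypothetical helper class; names are illustrative, not taken from JAudio.
final class MfccDistance {

    // Euclidean distance between two MFCC vectors of equal length (e.g. 13).
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("MFCC vectors must have the same length");
        }
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // cost[i][j] is the distance between frame i of the first signal
    // and frame j of the second signal, so the result is N1 x N2.
    static double[][] costMatrix(double[][] first, double[][] second) {
        double[][] cost = new double[first.length][second.length];
        for (int i = 0; i < first.length; i++) {
            for (int j = 0; j < second.length; j++) {
                cost[i][j] = euclidean(first[i], second[j]);
            }
        }
        return cost;
    }
}
```

For two signals with 157 frames each, `MfccDistance.costMatrix(first, second)` would return a 157 x 157 array. Because the differences are squared, negative coefficient values need no special handling in this step.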

This leaves you with an N1 x N2 two-dimensional array of distances, to which you should now apply DTW.
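As a rough illustration of the DTW step, here is a sketch that takes an N1 x N2 array of frame-to-frame distances and returns the accumulated cost of the cheapest warping path. It assumes the basic step pattern (match, insertion, deletion) with no window constraint and no path-length normalization. The class name `DtwSketch` and the method `dtw` are hypothetical:

```java
// Hypothetical class; a plain textbook DTW recurrence, not a library API.
final class DtwSketch {

    // cost[i][j] = local distance between frame i of signal 1 and frame j of signal 2.
    // Returns the total cost of the cheapest path from (0, 0) to (N1-1, N2-1).
    static double dtw(double[][] cost) {
        if (cost.length == 0 || cost[0].length == 0) {
            throw new IllegalArgumentException("cost matrix must not be empty");
        }
        int n1 = cost.length;
        int n2 = cost[0].length;
        double[][] acc = new double[n1][n2];
        for (int i = 0; i < n1; i++) {
            for (int j = 0; j < n2; j++) {
                double best;
                if (i == 0 && j == 0) {
                    best = 0.0;                 // start of the path
                } else if (i == 0) {
                    best = acc[0][j - 1];       // first row: only horizontal moves
                } else if (j == 0) {
                    best = acc[i - 1][0];       // first column: only vertical moves
                } else {
                    best = Math.min(acc[i - 1][j - 1],
                           Math.min(acc[i - 1][j], acc[i][j - 1]));
                }
                acc[i][j] = cost[i][j] + best;
            }
        }
        return acc[n1 - 1][n2 - 1];
    }
}
```

One common way to use this for classification, stated here as an assumption rather than as part of the answer, is to compute the DTW cost between the unknown signal and each labeled reference signal and pick the label with the smallest cost.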

3 Answers