My goal is to classify non-speech signals using MFCCs and DTW in Java, but I am stuck partway through and would appreciate any help. I have computed 13 MFCC values for each frame, but some of the values are negative, and I am not sure whether the process I am following is correct. I am currently using the code provided by JAudio. I have also tried other code, and it gives me negative values as well.
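For context on the negative values: MFCCs are usually obtained by applying a discrete cosine transform (DCT) to the log energies of a mel filter bank, and the cosine basis is signed, so negative coefficients are expected even when every input is positive. The following is a minimal sketch of that last step only, assuming an unnormalized DCT-II. The class name `DctDemo`, the method `dct`, and the input values are made up for illustration and are not taken from JAudio, whose scaling may differ.

```java
/** Hypothetical illustration of the final MFCC step; not JAudio code. */
final class DctDemo {

    /**
     * Unnormalized DCT-II of the log filter-bank energies,
     * keeping only the first numCoeffs output coefficients.
     */
    static double[] dct(double[] logEnergies, int numCoeffs) {
        int n = logEnergies.length;
        double[] coeffs = new double[numCoeffs];
        for (int k = 0; k < numCoeffs; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                // The cosine term takes both positive and negative values,
                // which is where negative coefficients come from.
                sum += logEnergies[i] * Math.cos(Math.PI * k * (i + 0.5) / n);
            }
            coeffs[k] = sum;
        }
        return coeffs;
    }
}
```

Calling `DctDemo.dct(new double[] {1.0, 2.0, 3.0, 4.0}, 2)` on these all-positive made-up inputs gives a positive first coefficient (the plain sum) and a negative second coefficient.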
Secondly, I get 13 coefficients per frame, so for a sample with 157 frames I get 157 sets of 13 MFCCs. I am having a hard time working out how to use all of these coefficients in DTW, because DTW only gives the minimum alignment distance between two time signals. I do have DTW code that compares two time signals, but I am not sure how to use all the MFCC values of a signal as features.
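To make the second point concrete, one common way to run DTW on MFCCs is to treat each signal as a sequence of 13-dimensional frame vectors and to use a vector distance between frames as the local cost, instead of the difference between two scalar samples. Below is a minimal sketch under those assumptions, using Euclidean distance between frames and the basic three-step recursion with no window constraint or path normalization. The class `MfccDtw` and its methods are hypothetical names, not part of JAudio or of my existing DTW code.

```java
import java.util.Arrays;

/** Hypothetical helper: DTW over sequences of MFCC frame vectors. */
final class MfccDtw {

    /** Euclidean distance between two frames with the same number of coefficients. */
    static double frameDistance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("frames must have the same length");
        }
        double sum = 0.0;
        for (int k = 0; k < x.length; k++) {
            double d = x[k] - y[k];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Accumulated DTW cost between two sequences of frames,
     * e.g. a[157][13] against b[140][13]. The sequences may differ
     * in the number of frames but not in coefficients per frame.
     */
    static double distance(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        if (n == 0 || m == 0) {
            throw new IllegalArgumentException("sequences must not be empty");
        }
        // cost[i][j] = best accumulated cost aligning a[0..i-1] with b[0..j-1]
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double local = frameDistance(a[i - 1], b[j - 1]);
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = local + best;
            }
        }
        return cost[n][m];
    }
}
```

With something like this, a simple classifier could compute `MfccDtw.distance(unknown, template)` against one or more labeled template recordings per class and pick the label with the smallest distance. Whether that nearest-template scheme is adequate for a given set of non-speech sounds is an open assumption.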
Is there some crucial step of classification I am missing? Please help me.