I have tried principal component analysis (PCA) for feature selection, which gave me 4 optimal features out of a set of nine (the mean, variance, and std. dev. of the Green, Red, and Hue channels, i.e. [MGcorr, VarGcorr, stdGcorr, MRcorr, VarRcorr, stdRcorr, MHcorr, VarHcorr, stdHcorr]) for classifying the data into two clusters. From the literature, it seems that plain PCA is not a very good method for this and that kernel PCA (KPCA) is better suited, so I want to apply KPCA for feature selection instead.
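For reference, my linear-PCA step looked roughly like the sketch below (a minimal reconstruction using MATLAB's built-in pca from the Statistics Toolbox; the 95% variance threshold and the loading-based ranking are only illustrative, not necessarily the right heuristic):

% feature is the 300x9 observation-by-feature matrix described above
[coeff, ~, latent] = pca(zscore(feature));      % standardize, then linear PCA
expl = cumsum(latent) / sum(latent);            % cumulative explained variance
k = find(expl >= 0.95, 1);                      % components covering 95% of variance
% rank original features by their largest absolute loading on the top k PCs
[~, order] = sort(max(abs(coeff(:, 1:k)), [], 2), 'descend');
top4 = order(1:4);                              % indices of 4 candidate features

For KPCA, I have tried the following: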
d = 4; % number of features to be selected (d: reduced dimension)
[Y2, eigVector, para] = kPCA(feature, d); % feature is a 300x9 matrix:
                                          % 300 observations, 9 features
% Y2: dimensionality-reduced data (300x4)
The kPCA.m function above can be downloaded from: http://www.mathworks.com/matlabcentral/fileexchange/39715-kernel-pca-and-pre-image-reconstruction/content/kPCA_v1.0/code/kPCA.m
For this implementation, I want to know how to determine which 4 of the 9 original features to select (i.e. which features are optimal) for clustering.
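To make the question concrete: what I am after is some ranking of the original features against the kernel-PCA output, e.g. something like the sketch below (a naive correlation heuristic, purely to illustrate the kind of answer I am looking for; I do not know whether it is actually sound, which is part of my question):

% score each original feature by its strongest absolute Pearson correlation
% with any of the d kernel-PCA coordinates in Y2 (300x4);
% needs the Statistics Toolbox corr
R = corr(feature, Y2);                  % 9x4 correlation matrix
featScore = max(abs(R), [], 2);         % best |correlation| per original feature
[~, ranking] = sort(featScore, 'descend');
top4 = ranking(1:4);                    % candidate top-4 original features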
Alternatively, I also tried the following KPCA implementation:
options.KernelType = 'Gaussian'; % Gaussian (RBF) kernel
options.t = 1;                   % kernel width parameter
options.ReducedDim = 4;          % target dimensionality
[eigvector, eigvalue] = KPCA(feature, options); % each row of feature is an
                                                % observation, as KPCA.m expects
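If I understand this interface correctly (an assumption on my part, based on the constructKernel.m helper that ships with the same code collection), the reduced-dimension data would then be obtained as:

% assumption: constructKernel.m builds the kernel matrix consistent with
% options.KernelType and options.t; kernel centering details are ignored here
K = constructKernel(feature, [], options); % 300x300 Gaussian kernel matrix
Y = K * eigvector;                         % 300x4 kernel-PCA embedding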
With this implementation I have the same problem of determining the top/optimal 4 features out of the set of 9.
This KPCA.m function can be downloaded from: http://www.cad.zju.edu.cn/home/dengcai/Data/code/KPCA.m
It would be great if someone could help me implement kernel PCA for my problem.
Thanks