I have this data set:
Instance num 0 : 300,24,'Social worker','Computer sciences',Music,10,5,5,1,5,''
Instance num 1 : 1000,20,Student,'Computer engineering',Education,10,5,5,5,5,Sony
Instance num 2 : 450,28,'Computer support specialist',Business,Programming,10,4,1,0,4,Lenovo
Instance num 3 : 1000,20,Student,'Computer engineering','3d Design',1,1,2,1,3,Toshiba
Instance num 4 : 1000,20,Student,'Computer engineering',Programming,2,5,1,5,4,Dell
Instance num 5 : 800,16,Student,'Computer sciences',Education,8,4,3,4,4,Toshiba
I want to classify it using SMO and other multi-class classifiers, so I convert all the nominal values to numeric using this code:
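// data is a weka.core.Instances object holding the data set above
int[] indices = {2, 3, 4, 10}; // indices of nominal columns
for (int i = 0; i < indices.length; i++) {
    int attInd = indices[i];
    Attribute att = data.attribute(attInd);
    // rename each nominal label to its index, e.g. "Student" becomes "1"
    for (int n = 0; n < att.numValues(); n++) {
        data.renameAttributeValue(att, att.value(n), "" + n);
    }
}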
The result is:
Instance num 0 : 300,24,0,0,0,10,5,5,1,5,0
Instance num 1 : 1000,20,1,1,1,10,5,5,5,5,1
Instance num 2 : 450,28,2,2,2,10,4,1,0,4,2
Instance num 3 : 1000,20,1,1,3,1,1,2,1,3,3
Instance num 4 : 1000,20,1,1,2,2,5,1,5,4,4
Instance num 5 : 800,16,1,0,1,8,4,3,4,4,3
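To make the intended mapping explicit, here is a minimal plain-Java sketch (no Weka involved) that turns a column of labels into numeric codes. The class `LabelIndexSketch` and its `encode` method are hypothetical names used only for illustration. It assumes each label gets the index of its first appearance, which happens to match the output above; Weka itself uses the order in which the values are declared on the attribute.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Weka: maps each distinct label to a numeric code.
final class LabelIndexSketch {

    // Assumption: codes follow the order of first appearance in the column.
    static double[] encode(String[] labels) {
        Map<String, Integer> indexOf = new LinkedHashMap<>();
        double[] codes = new double[labels.length];
        for (int i = 0; i < labels.length; i++) {
            Integer idx = indexOf.get(labels[i]);
            if (idx == null) {
                // First time this label is seen: give it the next free index.
                idx = indexOf.size();
                indexOf.put(labels[i], idx);
            }
            codes[i] = idx.doubleValue();
        }
        return codes;
    }

    public static void main(String[] args) {
        // Third column (index 2) of the data set above.
        String[] jobs = {"Social worker", "Student", "Computer support specialist",
                         "Student", "Student", "Student"};
        for (double code : encode(jobs)) {
            System.out.println(code);
        }
    }
}
```

Unlike the renamed nominal labels, the values returned here are real doubles, so they could be rescaled like any other numeric column.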
After applying the "Normalize" filter, the result looks like this:
Instance num 0 : 0,0.666667,0,0,0,1,1,1,0.2,1,0
Instance num 1 : 1,0.333333,1,1,1,1,1,1,1,1,1
Instance num 2 : 0.214286,1,2,2,2,1,0.75,0,0,0.5,2
Instance num 3 : 1,0.333333,1,1,3,0,0,0.25,0.2,0,3
Instance num 4 : 1,0.333333,1,1,2,0.111111,1,0,1,0.5,4
Instance num 5 : 0.714286,0,1,0,1,0.777778,0.75,0.5,0.8,0.5,3
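For reference, the numeric columns above are consistent with plain min-max scaling, (x - min) / (max - min). For example, 450 in the first column becomes (450 - 300) / (1000 - 300), roughly 0.214286. The sketch below reproduces that calculation in plain Java; `MinMaxSketch` and `normalize` are hypothetical names, not Weka API, and the handling of a constant column is an assumption made for the sketch.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: min-max scaling of one numeric column.
final class MinMaxSketch {

    // Rescales each value to (x - min) / (max - min), so the column spans [0, 1].
    static double[] normalize(double[] column) {
        double min = Arrays.stream(column).min().orElse(0.0);
        double max = Arrays.stream(column).max().orElse(0.0);
        double range = max - min;
        double[] scaled = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // Assumption: a constant column has no range, so map it to 0.
            scaled[i] = (range == 0.0) ? 0.0 : (column[i] - min) / range;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // First column of the data set above.
        double[] first = {300, 1000, 450, 1000, 1000, 800};
        System.out.println(Arrays.toString(normalize(first)));
    }
}
```

This is the rescaling I would expect to see on columns 2, 3, 4 and 10 as well once they are treated as numbers.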
The problem is that the converted columns are still stored as strings, so the "Normalize" filter will not normalize them.
Any ideas?
My second question: which multi-class classifiers can I use besides SMO?