I have read many solutions to this error, but my problem seems to be different from the others. I have a "train" dataset (ARFF) and a "test" dataset (ARFF), and both files have a string attribute "id". Everything works if I remove "id" from both files at the same time; if I leave "id" in the test file, I get an error. What confuses me is that my friend can make it work by removing "id" only from the training data, so his output contains the "id".
(Since he didn't remove "id" from the test file, the number of attributes is not the same in the two files, which contradicts what I have read: that the number of attributes must be exactly the same.)
I really need an output that contains the "id". Maybe I did something wrong when removing the attribute? I read somewhere that the test set's attributes may be a superset of the training set's. I also found a paragraph about how to remove the attribute: "Instead of using a nominal ID attribute, declare it as STRING attribute. With this you don't have to declare each possible value like with NOMINAL attributes and it therefore doesn't matter what strings are used in the test set that you're trying to use the trained model on. In order to be able to work with this STRING ID attribute you have to use the FilteredClassifier in conjunction with the Remove filter (package weka.filters.unsupervised.attribute) and your original base classifier. This setup will remove the ID attribute for the learning process (i.e., the base classifier), but you'll still be able to use it outside for tracking instances." http://weka.8497.n7.nabble.com/use-saved-model-td22857.html
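As far as I understand it, the setup described in that quoted paragraph would look roughly like the sketch below when using the Weka Java API. This is only my reading of it, under several assumptions: "id" is the first attribute and is kept in both ARFF files, the class is nominal and is the last attribute, J48 is just a placeholder base classifier, and the file names and the class name IdTrackingSketch are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.unsupervised.attribute.Remove;

public class IdTrackingSketch {

    // Trains with the id column hidden from the base classifier, then
    // returns one "id,predictedLabel" line per test instance.
    // idIndex is the 0-based position of the id attribute (assumed).
    static List<String> predictWithIds(Instances train, Instances test, int idIndex)
            throws Exception {
        Remove remove = new Remove();
        // Remove takes 1-based attribute indices as a range string.
        remove.setAttributeIndices(String.valueOf(idIndex + 1));

        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(remove);
        fc.setClassifier(new J48()); // placeholder; any base classifier could go here
        fc.buildClassifier(train);   // the filter is applied internally

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < test.numInstances(); i++) {
            // The unfiltered test instance is passed in; the id is still readable here.
            double pred = fc.classifyInstance(test.instance(i));
            String id = test.instance(i).stringValue(idIndex);
            String label = test.classAttribute().value((int) pred);
            lines.add(id + "," + label);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file names; both files keep the "id" attribute.
        Instances train = DataSource.read("train.arff");
        Instances test = DataSource.read("test.arff");
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        for (String line : predictWithIds(train, test, 0)) {
            System.out.println(line);
        }
    }
}
```

In this sketch neither ARFF file is edited by hand: both keep "id", so their headers match, and the Remove filter drops the attribute only inside the FilteredClassifier.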
Does anyone have an idea?
Any help would be appreciated.
My two ARFF files (left: train; right: test)
Outputs (left: my friend's output, with ids such as test_subject1005; right: my output)