I'm using Weka to pre-process a dataset. The problem is that I have a nominal attribute 'medical speciality' with more than 70 labels, so exploding it (converting it from nominal to binary) gives me more than 70 additional attributes in the dataset. I found a way to reduce this number as much as possible.
Here is an example of the label values to illustrate the idea:
*Pediatrics
*Pediatrics-Endocrinology
*Endocrinology
So I need to keep only Pediatrics and Endocrinology as attributes, and instances labelled Pediatrics-Endocrinology should get a 1 in Pediatrics and a 1 in Endocrinology.
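To make the intended result concrete, here is a minimal sketch of the transformation in plain Python. It is not Weka code. It assumes that combined labels are always joined by a hyphen and that no single speciality name contains a hyphen itself; the function name `multi_hot` is only for illustration.

```python
def multi_hot(labels, sep="-"):
    """Split combined labels on `sep` and build one 0/1 column per base label."""
    # Break each label into its base specialities, e.g. "A-B" -> ["A", "B"].
    parts_per_row = [
        [part.strip() for part in label.split(sep) if part.strip()]
        for label in labels
    ]
    # One column per distinct base speciality, in alphabetical order.
    columns = sorted({part for parts in parts_per_row for part in parts})
    # A row gets 1 in every column whose speciality appears in its label.
    rows = [[1 if col in parts else 0 for col in columns] for parts in parts_per_row]
    return columns, rows


labels = ["Pediatrics", "Pediatrics-Endocrinology", "Endocrinology"]
columns, rows = multi_hot(labels)
print(columns)
for label, row in zip(labels, rows):
    print(label, row)
```

With the three example labels this yields two columns instead of three, and the combined label is the only row with a 1 in both.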
How can I do that with Weka? Any suggestions?