I have downloaded a dataset from the UCI archive called Mammographic Mass. I opened the file in Excel and then saved it as a .csv file. The attribute information for the dataset is:

Attribute Information:

  1. BI-RADS assessment: 1 to 5 (ordinal)

  2. Age: patient's age in years (integer)

  3. Shape: mass shape: round=1 oval=2 lobular=3 irregular=4 (nominal)

  4. Margin: mass margin: circumscribed=1 microlobulated=2 obscured=3 ill-defined=4 spiculated=5 (nominal)

  5. Density: mass density high=1 iso=2 low=3 fat-containing=4 (ordinal)

  6. Severity: benign=0 or malignant=1 (binominal)

I open the file in the experiment environment and try to run it; however, I get the following error message:

13:01:56: Started

13:01:56: Class attribute is not nominal!

13:01:56: Interrupted

13:01:56: There was 1 error

I have tried setting the attribute as the class in the Explorer, but that has not worked. Any suggestions would be great :)

I assume you're trying to build a classification model that predicts Severity from the other attributes? If so you probably need to convert the Severity attribute from numeric to nominal using the appropriate filter in the Preprocess pane. - nekomatic
If @nekomatic's suggestion doesn't work, open the CSV file in Excel and change 0 to benign and 1 to malignant. The column will then be read as nominal when you import the CSV file. - zbicyclist
Please share your training file. It will help identify the exact cause. For now, it seems that you have not set the class attribute before starting the training. Before training starts, Weka must know which attribute is the class. I don't think there is any need to convert the Severity attribute to either integers or text labels. Either is fine as long as the attribute's possible values are listed in the ARFF structure. So please share your training file. - drp
I've now figured out my problem. The class attribute had values of 0 and 1. I needed to change them to true and false in Excel. Thanks for all the responses. - Daniel
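For anyone who would rather not edit the file by hand, the relabeling described in the comments above can be sketched in plain Java. This is only an illustration, not part of Weka: the class name RelabelSeverity and the sample rows are made up, and it assumes the CSV has a header row with Severity as the last column. Any non-numeric labels should do (benign/malignant here, true/false would work the same way).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: rewrites a numeric 0/1 class column as text labels
// so that a CSV importer reads the column as nominal rather than numeric.
final class RelabelSeverity {

    // Assumes the first line is a header and the class value is the last field.
    // "0" becomes "benign", "1" becomes "malignant"; any other value
    // (for example "?" for a missing value) is left unchanged.
    static List<String> relabel(List<String> csvLines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < csvLines.size(); i++) {
            String line = csvLines.get(i);
            if (i == 0 || line.trim().isEmpty()) {
                out.add(line); // header or blank line: copy as-is
                continue;
            }
            int comma = line.lastIndexOf(',');
            String head = line.substring(0, comma + 1);
            String last = line.substring(comma + 1).trim();
            if (last.equals("0")) {
                last = "benign";
            } else if (last.equals("1")) {
                last = "malignant";
            }
            out.add(head + last);
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative rows only, in the attribute order listed in the question.
        List<String> sample = Arrays.asList(
            "BI-RADS,Age,Shape,Margin,Density,Severity",
            "5,67,3,5,3,1",
            "4,28,1,1,3,0");
        relabel(sample).forEach(System.out::println);
    }
}
```

Write the returned lines back to a .csv file and load that file instead of the original.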

1 Answer

What you need is a filter, more specifically a Discretize filter, to preprocess your data.

For example, assuming ins is the Instances object where your data set is stored, the following code shows how to use a filter.

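import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Discretize; // assumption: the unsupervised variant

Discretize filter = new Discretize();
filter.setOptions(...);              // set options
filter.setInputFormat(ins);          // tell the filter the structure of the data
ins = Filter.useFilter(ins, filter); // apply the filter and keep the filtered data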