I am brand new to Weka and ML, so please excuse my ignorance. I've spent several hours trying to figure this out, so hopefully someone can point me in the right direction:

I am trying to run a J48 decision tree on data for USDJPY. The data was loaded from a .csv file, and the class value is nominal: TRUE or FALSE depending on whether USDJPY was trading more than 1% higher after 20 sessions. The problem is that when I run the algorithm, the decision tree simply uses the class value to solve the problem, which is useless. There are 22 attributes other than the class attribute from which I am looking to predict the class attribute.

When comparing my dataset to the example "glass" dataset, I cannot find any difference between the two that would explain my problem. "glass.arff" works as expected when I run J48 (with identical settings): it tries to predict the class value (type of glass) from the other attributes (i.e., it gets some guesses wrong).

What am I missing here? Here is the list of attributes:

@ATTRIBUTE date NUMERIC
@ATTRIBUTE open NUMERIC
@ATTRIBUTE high NUMERIC
@ATTRIBUTE low NUMERIC
@ATTRIBUTE close NUMERIC
@ATTRIBUTE 1daypctchg NUMERIC
@ATTRIBUTE smavg50onclose NUMERIC
@ATTRIBUTE smavg100onclose NUMERIC
@ATTRIBUTE smavg200onclose NUMERIC
@ATTRIBUTE ubb2 NUMERIC
@ATTRIBUTE bollma2 onclose NUMERIC
@ATTRIBUTE lbb2 NUMERIC
@ATTRIBUTE bollwjpybgn NUMERIC
@ATTRIBUTE %bjpybgn NUMERIC
@ATTRIBUTE rsi NUMERIC
@ATTRIBUTE ma50>100 {FALSE,TRUE}
@ATTRIBUTE ma50>200 {FALSE,TRUE}
@ATTRIBUTE ma100>200 {FALSE,TRUE}
@ATTRIBUTE up1pct5d? {FALSE,TRUE}
@ATTRIBUTE up1pct20d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct5d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct20d? {FALSE,TRUE}
stackoverflowuser2010: Are you using the Weka UI or the Java API?
trock2000: I am using the Weka UI.
stackoverflowuser2010: Are you marking the class column as the class in the UI? That will make the algorithm avoid using the class as a feature.
trock2000: How do I do that? I thought that the last (right-most) column in your dataset defaults to the class? I also confirmed the right-most column is bold in the preview window (if that means anything). I even tried changing the class via the drop-down menu in the Preprocess and Classify tabs. Am I missing something?
stackoverflowuser2010: Yes, the right-most column should be the class. If you followed all the steps to identify the correct column for the class, then I don't know what the problem is. Can you provide a link to the dataset?

1 Answer

Weka (and its J48 implementation) should be able to classify your data as long as the ground-truth class is consistently in the same column of your .csv file.
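
One thing worth checking is where the class actually sits in the header. Below is a minimal Python sketch (standard library only, not part of Weka) that reads the attribute declarations posted in the question and reports the position of the class. It assumes the intended class is `up1pct20d?`, based on the "more than 1% higher after 20 sessions" description; the question does not name the column, so treat that as a guess. The helper `attribute_names` is a hypothetical name introduced for illustration.

```python
import re

# Attribute declarations copied from the question.
HEADER = """\
@ATTRIBUTE date NUMERIC
@ATTRIBUTE open NUMERIC
@ATTRIBUTE high NUMERIC
@ATTRIBUTE low NUMERIC
@ATTRIBUTE close NUMERIC
@ATTRIBUTE 1daypctchg NUMERIC
@ATTRIBUTE smavg50onclose NUMERIC
@ATTRIBUTE smavg100onclose NUMERIC
@ATTRIBUTE smavg200onclose NUMERIC
@ATTRIBUTE ubb2 NUMERIC
@ATTRIBUTE bollma2 onclose NUMERIC
@ATTRIBUTE lbb2 NUMERIC
@ATTRIBUTE bollwjpybgn NUMERIC
@ATTRIBUTE %bjpybgn NUMERIC
@ATTRIBUTE rsi NUMERIC
@ATTRIBUTE ma50>100 {FALSE,TRUE}
@ATTRIBUTE ma50>200 {FALSE,TRUE}
@ATTRIBUTE ma100>200 {FALSE,TRUE}
@ATTRIBUTE up1pct5d? {FALSE,TRUE}
@ATTRIBUTE up1pct20d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct5d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct20d? {FALSE,TRUE}
"""


def attribute_names(header):
    """Return the attribute names in declaration order (hypothetical helper)."""
    names = []
    for line in header.splitlines():
        line = line.strip()
        if not line.upper().startswith("@ATTRIBUTE"):
            continue
        body = line[len("@ATTRIBUTE"):].strip()
        # The type is the trailing token: a keyword such as NUMERIC,
        # or a {...} set of nominal values. Everything before it is the name.
        match = re.match(r"(.+?)\s+(\{.*\}|\S+)$", body)
        if match:
            names.append(match.group(1))
    return names


names = attribute_names(HEADER)
class_name = "up1pct20d?"  # ASSUMPTION: the intended class, inferred from the question
class_index = names.index(class_name)

print(f"{len(names)} attributes declared")
print(f"'{class_name}' is at 0-based index {class_index}")
print(f"last attribute is '{names[-1]}'")
```

If that assumption holds, the header declares 22 attributes in total and the class is not the right-most column, so the default "last attribute is the class" behaviour would pick `dwn1pct20d?` instead. In that case the class has to be chosen explicitly in the Classify tab, or the column moved to the end of the .csv before loading.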