
I am trying to do Apriori association mining with WEKA (I use 3.7) on a given database table.

So I exported two columns (orderLineNumber and productCode) and loaded them into WEKA. So far I haven't had a single successful attempt; it always ends with "No large itemsets and rules found!"

I also tried converting the CSV into an ARFF file first using an ARFF converter, and I still get the same message.

I also tried the database loader in WEKA; the data loaded just fine but still gave the same result.

The only filter I applied in preprocessing is NumericToNominal.

What have I done wrong here? I suspect it is my ARFF format. Thank you.

Update: After further trials, I found out that I had exported the wrong column and was missing one filtering step, "denormalize". I installed the plugin via the package manager and denormalized my data after first converting it to nominal.

I then compared the results with those of the "Supermarket" sample. The only differences are that my output comes with 'f' instead of 't', and the confidence value always seems to be 100%.


1 Answer


First of all, orderLineNumber is the wrong column.

The position of an item on the printed bill says nothing about which items were bought together.

Secondly, the file format is not appropriate.

You want one line for every order and one column for every possible item in the @data section. To save memory, it may be helpful to use the sparse format (do not forget to set the flags appropriately).
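As a rough illustration of that layout, here is a minimal Python sketch that groups (order, product) pairs into one row per order and writes a dense ARFF file. It assumes the convention of Weka's bundled Supermarket sample, where a purchased item is `t` and an absent item is the missing marker `?`. The function name `pairs_to_arff`, the order numbers, and the product codes are made up for the example, and product codes containing quote characters would need extra escaping.

```python
import csv
import io
from collections import defaultdict


def pairs_to_arff(rows, relation="orders"):
    """Turn (order_id, product_code) pairs into a one-row-per-order ARFF string."""
    # Collect the set of products bought in each order.
    baskets = defaultdict(set)
    for order_id, product in rows:
        baskets[order_id].add(product)
    products = sorted({p for items in baskets.values() for p in items})

    lines = [f"@relation {relation}", ""]
    # One single-valued nominal attribute per product (assumed convention).
    lines += [f"@attribute '{p}' {{t}}" for p in products]
    lines += ["", "@data"]
    for order_id in sorted(baskets):
        items = baskets[order_id]
        # 't' if the order contains the product, '?' (missing) otherwise.
        lines.append(",".join("t" if p in items else "?" for p in products))
    return "\n".join(lines) + "\n"


# Hypothetical export: order identifier first, product code second.
sample = "10100,S18_1749\n10100,S18_2248\n10101,S18_2248\n10101,S18_2325\n"
arff = pairs_to_arff(csv.reader(io.StringIO(sample)))
print(arff)
```

The key point is that the grouping column is the order identifier, not the line position within the order.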

Other tools like ELKI can process input formats like the one below, which may be easier to use (it was also a lot faster than Weka):

apple banana
milk diapers beer

But last I checked, ELKI would "only" find frequent itemsets (the harder part), not compute association rules. I then used a tiny Python script to produce the actual association rules as desired.
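That script is not reproduced here, but the idea can be sketched as follows: given the support count of every frequent itemset, each split of an itemset into antecedent and consequent yields a rule whose confidence is support(itemset) / support(antecedent). The function name `association_rules`, the thresholds, and the third transaction are assumptions for illustration; the brute-force counting only stands in for the output of a real frequent itemset miner.

```python
from itertools import combinations


def association_rules(support, min_confidence=0.6):
    """Yield (antecedent, consequent, confidence) from frequent itemset counts.

    `support` maps frozenset(items) -> number of transactions containing them
    and must also contain every non-empty subset of each itemset.
    """
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue
        # Try every non-empty proper subset as the antecedent.
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, confidence


# Toy data: the two baskets from the example plus one made-up basket.
transactions = [{"apple", "banana"}, {"milk", "diapers", "beer"}, {"diapers", "beer"}]

# Brute-force support counting (minimum count 2), standing in for a real miner.
items = sorted(set().union(*transactions))
support = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        n = sum(1 for t in transactions if set(combo) <= t)
        if n >= 2:
            support[frozenset(combo)] = n

rules = sorted((sorted(a), sorted(c), conf) for a, c, conf in association_rules(support))
for antecedent, consequent, conf in rules:
    print(antecedent, "=>", consequent, f"confidence={conf:.2f}")
```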