I'm trying to apply the Apriori algorithm in Weka.
Wikipedia has a simple example for that (Apriori algorithm):

alpha beta epsilon
alpha beta theta
alpha beta epsilon
alpha beta theta

The following association rules can be determined from this table:

  • 100% of sets with alpha also contain beta
  • 50% of sets with alpha, beta also have epsilon
  • 50% of sets with alpha, beta also have theta

I converted these entries into a CSV file and added a header row with attribute names, which gave me this file:

prod1,prod2,prod3
alpha,beta,epsilon
alpha,beta,theta
alpha,beta,epsilon
alpha,beta,theta

I loaded it into Weka and opened the "Associate" tab, where the "Apriori" algorithm is selected by default.

As a result I get the following:

 1. prod2=beta 4 ==> prod1=alpha 4    conf:(1)
 2. prod1=alpha 4 ==> prod2=beta 4    conf:(1)
 3. prod3=epsilon 2 ==> prod1=alpha 2    conf:(1)
 4. prod3=theta 2 ==> prod1=alpha 2    conf:(1)
 5. prod3=epsilon 2 ==> prod2=beta 2    conf:(1)
 6. prod3=theta 2 ==> prod2=beta 2    conf:(1)
 7. prod2=beta prod3=epsilon 2 ==> prod1=alpha 2    conf:(1)
 8. prod1=alpha prod3=epsilon 2 ==> prod2=beta 2    conf:(1)
 9. prod3=epsilon 2 ==> prod1=alpha prod2=beta 2    conf:(1)
10. prod2=beta prod3=theta 2 ==> prod1=alpha 2    conf:(1)

But I also want the frequencies as in the example from Wikipedia (see above).

1 Answer

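The confidence that Weka reports (e.g. conf:(1)) is exactly the "frequency" you are looking for: it is the fraction of rows containing the left-hand side of a rule that also contain the right-hand side, so conf:(1) corresponds to 100%.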
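To make the link between confidence and the Wikipedia percentages concrete, here is a minimal Python sketch that computes confidence by hand on the four rows from the question. The helper names `support_count` and `confidence` are my own for illustration and are not part of Weka.

```python
# The four transactions from the question.
transactions = [
    {"alpha", "beta", "epsilon"},
    {"alpha", "beta", "theta"},
    {"alpha", "beta", "epsilon"},
    {"alpha", "beta", "theta"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """Fraction of transactions with the antecedent that also have the consequent."""
    covered = support_count(antecedent, transactions)
    if covered == 0:
        raise ValueError("antecedent never occurs in the data")
    return support_count(antecedent | consequent, transactions) / covered

print(confidence({"alpha"}, {"beta"}, transactions))             # 1.0
print(confidence({"alpha", "beta"}, {"epsilon"}, transactions))  # 0.5
print(confidence({"epsilon"}, {"alpha"}, transactions))          # 1.0
```

The second line is the "50% of sets with alpha, beta also have epsilon" rule, and the third matches rule 3 in the Weka output (2 of 2 rows, conf:(1)).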

As you can see, your rule "50% of sets with alpha, beta also have epsilon" is not in Weka's output. This is because Weka lists rules in decreasing order of confidence and prints only a limited number of them, and all ten rules shown have confidence 1. To see your 50% rule, you need to make Weka output more rules.

This can be done by increasing "numRules" (whose default value is 10, see the screenshot below). For your particular example you will also need to decrease "minMetric" from 0.9 to 0.5 (or lower), because rules whose confidence falls below that threshold are discarded.
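The effect of the two settings can be imitated on the example data with a rough Python sketch. This is not Weka's implementation: it enumerates rules by brute force and ignores Weka's minimum-support handling, and `min_confidence` and `max_rules` are hypothetical stand-ins for "minMetric" and "numRules".

```python
from itertools import combinations

# The four transactions from the question.
transactions = [
    frozenset({"alpha", "beta", "epsilon"}),
    frozenset({"alpha", "beta", "theta"}),
    frozenset({"alpha", "beta", "epsilon"}),
    frozenset({"alpha", "beta", "theta"}),
]

def count(itemset):
    """Number of transactions containing every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def all_rules():
    """List (antecedent, consequent, confidence) for every itemset that occurs."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            both = count(itemset)
            if both == 0:
                continue  # itemset never occurs, so there is no rule to report
            # Every non-empty proper subset can serve as the antecedent.
            for k in range(1, size):
                for ante in combinations(combo, k):
                    antecedent = frozenset(ante)
                    conf = both / count(antecedent)
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

def select(rules, min_confidence, max_rules):
    """Drop rules below the threshold, sort by confidence, keep the first max_rules."""
    kept = [r for r in rules if r[2] >= min_confidence]
    kept.sort(key=lambda r: r[2], reverse=True)
    return kept[:max_rules]

rules = all_rules()
print(len(select(rules, 0.9, 10)))   # 10: only confidence-1 rules, as in the question
print(len(select(rules, 0.9, 100)))  # 12: a higher cap alone adds no 50% rules
print(len(select(rules, 0.5, 100)))  # 22: now the 50% rules are included
```

With the threshold at 0.9, raising the cap only reveals the two remaining confidence-1 rules, which is why both settings have to change before the 50% rules show up.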

[Screenshot: Weka GUI Apriori parameters]