I would like to use Apriori to carry out affinity analysis on transaction data. I have a table with a list of orders and their information. I mainly need the OrderID and ProductID attributes, which are in the following format:
OrderID ProductID
1 A
1 B
1 C
2 A
2 C
3 A
Weka requires you to create a nominal attribute for every product ID and to specify whether the item is present in the order using a true or false value, like this:
1, TRUE, TRUE, TRUE
2, TRUE, FALSE, TRUE
3, TRUE, FALSE, FALSE
My dataset contains about 10k records and about 3k different products. Can anyone suggest a way to create the dataset in this format, other than building it by hand, which would be far too time-consuming?
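One possible way to do the conversion, as a minimal sketch rather than a definitive solution: group the product IDs by order, collect the full set of products, and emit one TRUE/FALSE row per order. The sketch assumes the data is available as (OrderID, ProductID) pairs. The function name `pivot_orders` and the choice of CSV as the output format are illustrative assumptions, not something Weka prescribes. It uses only the Python standard library.

```python
import csv
import io
from collections import defaultdict


def pivot_orders(pairs):
    """Turn (order_id, product_id) pairs into one TRUE/FALSE row per order.

    `pivot_orders` is a hypothetical helper name used for illustration.
    """
    # Collect the set of products bought in each order.
    baskets = defaultdict(set)
    for order_id, product_id in pairs:
        baskets[order_id].add(product_id)

    # One column per distinct product, in sorted order.
    products = sorted({p for items in baskets.values() for p in items})
    header = ["OrderID"] + products

    # One row per order: TRUE if the product is in the basket, else FALSE.
    rows = [
        [order_id] + ["TRUE" if p in items else "FALSE" for p in products]
        for order_id, items in sorted(baskets.items())
    ]
    return header, rows


# Sample data taken from the table in the question.
pairs = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "C"), (3, "A")]
header, rows = pivot_orders(pairs)

# Write the result as CSV text (to a file instead of a buffer in practice).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

For real data, the pairs could be read from an export of the orders table with `csv.reader` and the output written to a file, which Weka should be able to open through its CSV import. With about 3k products, each row would have about 3k mostly FALSE columns, so the output is large but still practical to generate this way.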