3
votes

I have a dataset with mostly integer values. I want to apply association rule mining on it. I have taken a look at the popular algorithms like Apriori, etc. but all of them work on data which have boolean values, i.e., either the item exists in the transaction or doesn't.

Is there an algorithm which lets us account for values of the attributes in addition to their counts? (I plan to normalize the data to have values between 0 and 1)

3 Answers

1
votes

You can "hack" around this limitation if your numbers are integers (why normalize to between 0 and 1?) and small:

apple banana apple

becomes

apple banana apple_2

which would allow you to find association rules like

banana => apple, apple_2

but you need to mix in some clever filters to avoid useless rules like

apple_2 => apple
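
A minimal sketch of this encoding and of one possible filter, in Python. The function names (`encode_counts`, `split_token`, `is_trivial`) are hypothetical, and the sketch assumes that no original item name already ends in `_<number>`. The filter shown is only one reasonable choice: it drops a rule when every token in the consequent is already implied by a token of the same item with an equal or higher count in the antecedent.

```python
from collections import Counter


def encode_counts(transaction):
    """Turn repeated items into count-tagged tokens.

    An item occurring n times yields the tokens item, item_2, ..., item_n.
    """
    tokens = set()
    for item, count in Counter(transaction).items():
        tokens.add(item)
        for k in range(2, count + 1):
            tokens.add(f"{item}_{k}")
    return tokens


def split_token(token):
    """Split 'apple_2' into ('apple', 2); a plain 'apple' becomes ('apple', 1)."""
    base, sep, suffix = token.rpartition("_")
    if sep and suffix.isdigit():
        return base, int(suffix)
    return token, 1


def is_trivial(antecedent, consequent):
    """True if every consequent token follows from the encoding alone."""
    levels = {}
    for token in antecedent:
        base, level = split_token(token)
        levels[base] = max(levels.get(base, 0), level)
    return all(
        levels.get(base, 0) >= level
        for base, level in map(split_token, consequent)
    )


if __name__ == "__main__":
    print(sorted(encode_counts(["apple", "banana", "apple"])))
    print(is_trivial({"apple_2"}, {"apple"}))
    print(is_trivial({"banana"}, {"apple", "apple_2"}))
```

The encoded transactions can then be fed to any ordinary Boolean itemset miner, with `is_trivial` applied to the rules it returns.
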
1
votes

Item-item collaborative filtering is quite similar to similarity-based data mining techniques like association rule mining. Moreover, collaborative filtering was built to handle continuous and ordinal values, such as star ratings or a Likert scale: this is usually preference information from users.
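
To illustrate how item-item similarity can use the values directly, here is a rough sketch in plain Python. It uses cosine similarity, which is one common choice and not the only one. The input layout (a dict mapping each user or transaction to its item values) and the function names are assumptions made for the example, and missing values are treated as 0.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def item_similarities(ratings):
    """Compute pairwise item similarities.

    ratings: dict mapping user -> dict mapping item -> numeric value.
    Returns a dict mapping (item_a, item_b) with item_a < item_b to a similarity.
    """
    users = sorted(ratings)
    items = sorted({item for row in ratings.values() for item in row})
    # One vector per item, with one entry per user; missing entries count as 0.
    vectors = {i: [ratings[u].get(i, 0) for u in users] for i in items}
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for a in items
        for b in items
        if a < b
    }


if __name__ == "__main__":
    ratings = {
        "u1": {"apple": 2, "banana": 4},
        "u2": {"apple": 1, "banana": 2},
        "u3": {"cherry": 5},
    }
    for pair, sim in item_similarities(ratings).items():
        print(pair, round(sim, 3))
```

Item pairs with high similarity play a role comparable to frequent pairs in association rule mining, except that the magnitudes of the values contribute to the score.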

Content-based filtering is probably your best bet for the situation you describe. It allows for item attributes and weights (that do not change per user for that item), then takes in user preference for each item (that does change per user for that item).

If you want both preference (counts) and attributes to change for each user-item pair, I don't know of an algorithm that handles that. Usually algorithms are built for one input per user-item pair.

0
votes

Yes. There are some variations of the itemset mining problem that will let you specify additional information. For example, high utility itemset mining algorithms let you specify a quantity for each item occurring in a transaction, as well as a weight for each item.
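
As a rough illustration of the idea, the sketch below computes itemset utilities by brute force in Python. It assumes the usual definition: the utility of an itemset is the sum of quantity times weight for its items, taken over every transaction that contains the whole itemset. The function names and the toy data are made up for the example. Real high utility itemset mining algorithms prune the search space instead of enumerating every itemset, so this is only suitable for very small inputs.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, weights):
    """Sum quantity * weight over all transactions containing every item of the itemset."""
    total = 0
    for quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * weights[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, weights, min_utility):
    """Return {itemset: utility} for every itemset whose utility is >= min_utility.

    transactions: list of dicts mapping item -> quantity in that transaction.
    weights: dict mapping item -> per-unit weight (for example, profit).
    """
    items = sorted({item for quantities in transactions for item in quantities})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, weights)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    transactions = [
        {"apple": 2, "banana": 1},
        {"apple": 1, "cherry": 3},
        {"apple": 3, "banana": 2, "cherry": 1},
    ]
    weights = {"apple": 1, "banana": 5, "cherry": 2}
    for itemset, utility in high_utility_itemsets(transactions, weights, 15).items():
        print(itemset, utility)
```

Note that utility is not anti-monotone: in this toy data `("apple",)` alone falls below the threshold while `("apple", "banana")` exceeds it, which is why Apriori-style pruning on support does not carry over directly.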