In frequent itemset generation of association rule mining, what is the fundamental difference between maximal & closed patterns itemsets. Can someone guide me a resource about them?
4 Answers
From this original source:
A closed pattern is a frequent pattern. So it meets the minimum support criteria. In addition to that, all super-patterns of a closed pattern are less frequent than the closed pattern.
Let’s see some examples.
Suppose, the minimum support count is 2. For the first example, suppose there are a total of 3 items: a, b, c. Suppose a pattern ab has support count of 2 and a pattern abc has support count of 2. Is the pattern ab is a closed pattern? Pattern ab is a frequent pattern, but it has a super-pattern that is NOT less frequent than ab.
For the second example,
suppose there are a total of 3 items: x, y, z. suppose a pattern xy has support count of 3 and a pattern xyz has support count of 2. Is the pattern xy is a closed pattern? Pattern xy is a frequent pattern and also the only super-pattern xyz is less frequent than xy.
Therefore, xy is a closed pattern.
A max pattern is
a frequent pattern. So it also meets the minimum support criteria like closed pattern In addition, but unlike closed pattern, all super-patterns of a max pattern are NOT frequent patterns.
Let’s see some examples as well.
Suppose, the minimum support count is 2. Like before, for the first example, suppose there are a total of 3 items: a, b, c. Suppose a pattern ab has support count of 3 and a pattern abc has support count of 2. Is the pattern ab is a max pattern? Pattern ab is a frequent pattern, but it has a super-pattern that is a frequent pattern as well. So, pattern ab is NOT a max pattern.
For the second example,
suppose there are a total of 3 items: x, y, z. Suppose a pattern xy has support count of 3 and a pattern xyz has support count of 1. Is the pattern xy is a max pattern? Pattern xy is a frequent pattern and also the only super-pattern xyz is NOT a frequent pattern. Therefore, xy is a max pattern.
In frequent itemset mining:
- A maximal itemset is an itemset that has no superset that is frequent.
- A closed itemset is an itemset that has no superset that has the same support.
Maximal itemsets are a subset of the set of closed itemsets, which are a subset of all frequent itemsets.
You can get implementations of closed and maximal itemset mining algorithms (FPMax, FPClosed, DCI_Closed, CHarm, etc.) with examples as part of the SPMF data mining library. (I'm the author)
In frequent itemset mining:
X is said to be a max-pattern if X is a frequent pattern and there exists no frequent super pattern Y (where Y is a super set of X). Max Patterns are lossy forms of compression as the underlying support information is lost.
On the other hand, X is said to be a closed-pattern if X is frequent and there exits no super pattern Y (where Y is a super set of X) with the same support as X. Closed Patterns are lossless forms of compression, as the support information is stored within the pattern.
i think your question addresses maximal frequent itemset and closed frequent itemset.
The main difference between them are
@maximal frequent itemset does not provide the support count of their subsets.
@closed frequent itemset preserves the support count of its itemset.
you can refer the following link for better understanding about association mining and maximal and closed frequent itemset.