2 votes

Imagine that the universe of all known mappings between the values of a set of variables V and a set of tag names T (classification labels) is known. Further, assume that the total space of unique variable-value combinations is large (> 100B points), that the tag set is relatively small (thousands of elements), and that the number of variables is very small (4-10).

What is an algorithm for building a classifier function that provides a perfect mapping (matching the a priori knowledge with no false positives or false negatives) from variable values to labels, with the following space and time complexity goals:

  • Time complexity lower than O(|V|*log|T|)
  • Space complexity less than O(|V|k), k ≤ e

Or, rephrased as a decision tree problem:

  1. How can a decision tree algorithm be tuned to create a perfect mapping?
  2. How can the training data be represented efficiently to guarantee that?
You haven't defined what a "perfect" mapping is. - phs
@phs good catch; fixed. - Sim
How is the prior knowledge represented? How are you able to say to which class any instance belongs, without enumerating all possibilities? It seems to me that this is the classification rule you should be using, not trying to get a decision tree to match this prior knowledge. Perhaps if you explain the context? - Ben Allison
@BenAllison very good question. The "class" space is generated by the output of the following question: stackoverflow.com/questions/15803808/… - Sim

1 Answer

4 votes

What you are trying to achieve should be possible with any decision tree classifier that allows you to specify the level of pruning in some way. The idea is to make it do no pruning at all. The decision tree you end up with would (potentially) have one leaf per training instance, i.e. it would be very large, but it would give you "perfect" accuracy with a prediction time of O(|V|*log|T|).
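As a minimal sketch (my own illustration, not part of the original answer), this is what "no pruning at all" might look like with scikit-learn's DecisionTreeClassifier; the arrays X and y are hypothetical stand-ins for the variable-value combinations and their tags:

```python
# Sketch: an unpruned decision tree that memorizes the training mapping.
# Assumes scikit-learn; X and y are hypothetical toy data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy a priori mapping: 4 variables, a handful of known (values -> tag) pairs.
X = np.array([[0, 1, 0, 2],
              [1, 1, 0, 2],
              [0, 0, 3, 1],
              [2, 1, 3, 0]])
y = np.array(["tag_a", "tag_a", "tag_b", "tag_c"])

# Disable every pruning/stopping criterion so the tree grows until every
# leaf is pure, i.e. it reproduces the training mapping exactly.
clf = DecisionTreeClassifier(
    max_depth=None,        # no depth limit
    min_samples_split=2,   # split any impure node
    min_samples_leaf=1,    # a leaf may hold a single instance
    ccp_alpha=0.0,         # no cost-complexity pruning
)
clf.fit(X, y)

assert (clf.predict(X) == y).all()  # zero training error: a "perfect" mapping
```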

This is completely independent of how the training data is represented (and should be). The only thing that matters is that the decision tree inducer can read and process it. One simple way of building such a tree is to add a path for the first example, then merge in a path for the second, and so on.
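To make the "merge in one path per example" construction concrete, here is a sketch (again my own illustration, under the assumption that variable values are hashable) that stores the tree as a nested dictionary with one level per variable, so shared value prefixes are merged automatically:

```python
# Sketch: build the tree by inserting one path per training example.
# Shared prefixes are stored once; lookup is O(|V|) dict accesses.
# The training data below is hypothetical.

def add_path(tree, values, tag):
    """Insert one training instance (values -> tag) into the tree."""
    node = tree
    for v in values[:-1]:
        node = node.setdefault(v, {})  # reuse or create the branch
    node[values[-1]] = tag             # the leaf stores the label

def classify(tree, values):
    """Follow the path for `values`; KeyError means an unseen combination."""
    node = tree
    for v in values:
        node = node[v]
    return node

tree = {}
training = [((0, 1, 0, 2), "tag_a"),
            ((1, 1, 0, 2), "tag_a"),
            ((0, 0, 3, 1), "tag_b")]
for values, tag in training:
    add_path(tree, values, tag)

assert classify(tree, (0, 0, 3, 1)) == "tag_b"
```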

Whether such a classifier would be useful in practice is a completely different question of course -- in most cases it won't be.