This is a revised answer.
Class unions are funky -- they effectively insert a class into the middle of an existing hierarchy, so suddenly things that extend list
now extend listOrNULL
.
Instead I'd create a small class hierarchy that represents a "Tree" that can either be "Empty" or "Internal". The "Internal" class will have a slot to contain data (of type "ANY"), plus the left and right links, which will be "Tree" elements.
setClass("Tree")
setClass("Empty", contains="Tree")
setClass("Internal", contains="Tree",
representation=representation(elem="ANY", left="Tree", right="Tree"),
prototype=prototype(left=new("Empty"), right=new("Empty")))
I'll write a constructor for my Tree, with methods for creating an empty tree, and a tree from an element plus left and right descendants.
setGeneric("Tree", function(elem, left, right) standardGeneric("Tree"),
signature="elem")
setMethod(Tree, "missing", function(elem, left, right) new("Empty"))
setMethod(Tree, "ANY", function(elem, left, right) {
new("Internal", elem=elem, left=left, right=right)
})
A basic operation is to insert an element x
into a tree t
setGeneric("insert", function(x, t) standardGeneric("insert"))
setMethod(insert, c("ANY", "Empty"), function(x, t) {
Tree(x, Tree(), Tree())
})
setMethod(insert, c("ANY", "Internal"), function(x, t) {
if (x < t@elem) {
l <- insert(x, t@left)
r <- t@right
} else {
l <- t@left
r <- insert(x, t@right)
}
Tree(t@elem, l, r)
})
Another operation is to test for membership
setGeneric("member", function(x, t) standardGeneric("member"))
setMethod(member, c("ANY", "Empty"), function(x, t) FALSE)
setMethod(member, c("ANY", "Internal"), function(x, t) {
if (x < t@elem) member(x, t@left)
else if (t@elem < x) member(x, t@right)
else TRUE
})
An interesting, functional, property of this implementation is that it is persistent
> t <- Tree()
> t1 <- insert(10, t)
> t2 <- insert(5, t1)
> t3 <- insert(7, t2)
> t4 <- insert(15, t3)
> which(sapply(1:20, member, t4))
[1] 5 7 10 15
> which(sapply(1:20, member, t2))
[1] 5 10
This is not going to be efficient when there are a lot of updates, because of the inefficiencies of S4 class creation and because modifying a tree (e.g., adding a node) copies all nodes in the path to the new node. A different approach represents the tree as a matrix
of left, right, value triples. The S4 implementation would still have poor performance, because the updates of the instance would create new instances, duplicating everything. So I'd end up at a reference class, with fields 'value' (a vector of whatever the tree is supposed to hold and a matrix
of left and right relations.