I'm working on an implementation of the usual unification algorithm in the usual way: recursive descent through expression trees, adding variable bindings to a hash table along the way, doing the occurs check. In Java as it happens, using override functions to go with the grain of the language, so the part of the implementation dealing with variables is:
@Override
public boolean unify(Term a, Map<Var, Term> map) {
if (this == a) {
return true;
}
Term x = map.get(this);
if (x != null) {
return x.unify(a, map);
}
if (a instanceof Var) {
x = map.get((Var) a);
if (x != null) {
return x.unify(this, map);
}
}
if (a.occurs(this)) {
return false;
}
map.put(this, a);
return true;
}
This version is correct and for many cases quite fast, but it has a problem, which arises particularly when using it for type inference. When unifying lots of variables to the same target, it ends up with a set of bindings that basically looks like this:
a=b
b=c
c=d
d=e
Then each time a new variable must be unified to the same thing, it has to step through the chain one step at a time to find where it's at by now, which takes O(N) time, which means unifying a collection of variables to the same thing takes total time O(N^2).
Probably the best solution is to implement some kind of shortcut, something along the lines of updating a
to point directly to the current end target, whatever that may be. It's not entirely obvious how to do this in a way that will be correct and efficient in all cases.
Unification has been well known and quite widely used for decades, so I would imagine the solution to this must also have been known for decades, but the several discussions I've seen of unification, don't seem to mention it.
What exactly is the way to modify the algorithm to deal with it?