Yep, I've thought about this too. Basically, the idea is that it seems like you could implement Haskell 98, but not some of the language extensions to it, using polymorphism-by-multiinstantiation instead of polymorphism-by-boxing.
You can get some insight into this by trying to implement some Haskell features as C++ libraries (as you note, C++ does polymorphism-by-multiinstatiation). What you find is that you can do everything that Haskell can do, except that it's impossible to have polymorphic values, which includes references to polymorphic functions.
What this looks like is that if you have
template<typename T>
void f(T); // f :: a -> IO ()
you can take the address of a particular instantiation to pass around as a function pointer at runtime:
&f<int>
but you cannot take the address of a template (&f
). This makes sense: templates are a purely compile-time construct. It also makes sense that if you're doing polymorphism by multiinstantiation, you can have a pointer to any particular instantiation, but you cannot have a pointer to the polymorphic function itself, because at the machine code level, there isn't one.
So where does Haskell use polymorphic values? At first glance it seems like a good rule of thumb of is "anywhere you have to write an explicit forall". So PolymorphicComponents
, Rank2Types
, RankNTypes
, and ImpredicativeTypes
are obvious no-nos. You can't translate this to C++:
data MkList = MkList (forall a. a -> [a])
singleton = MkList (\x -> [x])
On the other hand, ExistentialQuantification
is doable in at least some cases: it means having a non-template class with a template constructor (or more generally, a class whose constructor is templated on more things than the class itself).
If in Haskell you have:
data SomeShow = forall a. Show a => SomeShow a
instance Show SomeShow where show (SomeShow a) = show a
you can implement this in C++ as:
// a function which takes a void*, casts it to the given type, and
// calls the appropriate show() function (statically selected based
// on overload resolution rules)
template<typename T>
String showVoid(void *x)
{
show(*(T*)x);
}
class SomeShow
{
private:
void *m_data;
String (*m_show)(void*); // m_show :: Any -> String
public:
template<typename T>
SomeShow(T x)
: m_data(new T(x)) // memory management issues here, but that's orthogonal
, m_show(&showVoid<T>)
{
}
String show()
{
// alternately we could declare the top-level show() as a friend and
// put this there
return m_show(m_data);
}
};
// C++ doesn't have type classes per se, but it has overloading, which means
// that interfaces are implicit: where in Haskell you would write a class and
// instances, in C++ you just write a function with the same name for each type
String show(SomeShow x)
{
return x.show();
}
In both languages you have a non-polymorphic type with a polymorphic constructor.
So we have shown that there are some language extensions you can implement and some you can't, but what about the other side of the coin: is there anything in Haskell 98 that you can't implement? Judging by the fact that you need a language extension (ExplicitForAll
) to even write a forall, you would think that the answer is no. And you would almost be right, but there's two wrinkles: type classes and polymorphic recursion. Type classes are typically implemented using dictionary passing: each instance declaration results in a record of functions, which are implicitly passed around wherever they're needed.
So for Monad for example you would have:
data MonadDict m = MonadDict {
return :: forall a. a -> m a,
(>>=) :: forall a b. m a -> (a -> m b) -> m b
}
Well would you look at those foralls! You can't write them explicitly, but in dictionary passing implementations, even in Haskell 98, classes with polymorphic methods result in records containing polymorphic functions. Which if you're trying to implement the whole thing using multiinstantion is obviously going to be a problem. You can almost get away without dictionary passing because, if you stick to Haskell 98, instances are almost always global and statically known. Each instance results in some polymorphic functions, but because which one to call is almost always known at compile time, you almost never need to pass references to them around at runtime (which is good, because you can't). The tradeoff is that you need to do whole-program compilation, because otherwise instances are no longer statically known: they might be in a different module. And the exception is polymorphic recursion, which practically requires you to build up a dictionary at runtime. See the other answer for more details on that. Polymorphic recursion kills the multiinstantiation approach even without type classes: see the comment about BTree
s. (Also ExistentialQuantification
*plus* classes with polymorphic methods is no longer doable, because you would have to again start storing pointers to polymorphic functions.)