I'm looking for a proper name for a data type that combines ADT with explicit subtyping.
In one of my applications, I use a structure similar to ADT to represent parse trees, on which I perform recursive pattern matching. I find it rather convenient if I could combine ADT with subtyping, as demonstrated in the example below:
Note: the example is written in Haskell's syntax, but this is not Haskell code.
data Empty = Empty
data Expr = Int Int | Add Expr AddOp Expr
data OptionalExpr =
| Empty // I want to make Empty a subtype of OptionalExpr
| Expr // I want to make Expr a subtype of OptionalExpr
In the example above, I first define 2 types: Empty and Expr. Then I make these 2 types the subtype of OptionalExpr. I realize this kind of data type is uncommon. Apparently neither Haskell nor OCaml support it. But I don't know about other functional languages.
I am looking for something that combines ADT with explicit subtyping, not structurally-implied subtyping as in polymorphic variant. There are a few justifications for this requirement:
- First, we want all-or-none subtyping. Say we want A to be a subtype of B, then we will never want to include only some of the variants of A under B. Either A is a subtype of B, in which case B includes all the variants of A, or A is not a subtype of B, in which case B includes none of the variants of A. We don't allow gray-area in between.
- Second, we don't want B to be open in any sense. We have in mind a very specific set of subtypes of B. We don't want something to become an instance of B just by implementing a typeclass or the like.
- Third, say type A has a large number of variants. We want to make type B a supertype of A. Copying all the variants into B, as is required with polymorphic variant, is just too cumbersome and error-prone.
- Fourth, we don't want to introduce new value-constructors when all we want to express is a subtype. In the example above, we could have written OptionalExpr as an ADT with 2 value-constructors, like this:
data OptionalExpr = EmptyExpr | NonEmptyExpr Expr, or we could have usedMaybe, but in my application this is unacceptable, because the level of embedding can be quite deep, and it would be a nightmare to deconstruct an deeply embedded value like(L1 (L2 (L3 (L4 (L5 value_wanted))))).
To give you some idea why such requirements exist, I show a more specific example below:
PrimaryExpr = ID | LeftParen Expr RightParen
UnaryExpr = PrimaryExpr | NegateOp PrimaryExpr // -
MultExpr = UnaryExpr | MultExpr MultOp UnaryExpr // *
AddExpr = MultExpr | AddExpr AddOp MultExpr // +
CompExpr = AddExpr | AddExpr CompOp AddExpr
Expr = CompExpr
The above example expresses a subtype hierarchy, and expresses ideas such as AddExpr is a CompExpr, but a CompExpr is not an AddExpr. For this specific example, some people have suggested to me that I can replace UnaryExpr, MultExpr, AddExpr and so on with just Expr. That is, I can define all the types as a single type. That loses type constraints such as CompExpr is not AddExpr, and because I'm doing recursive pattern matching on these types, I need that constraints of this hierarchy to be statically enforced.
Is there a name for this kind of data type I'm looking for in the literature? Or am I looking for something that doesn't even make sense? If you think this is the case, why am I looking for something nonsensical? Thanks for any pointers.
EDIT: even though I've written the above code snippets in Haskell's syntax, I am not writing my application in Haskell. I'm using my own language and my own data types, so I am not limited by Haskell's semantics. I am looking for a pointer to similar concepts in literature, so that when I write a report for my project I don't appear to be reinventing something new. I tried all the google keywords I can think of and nothing right was returned so I'm asking here.