2
votes

I'm trying to represent a set of productions of a grammar in OCaml and existential types would be very useful to model the semantic actions of grammar rules. I've been looking into Menhir source code and existential types are also used to model the semantic actions. Consider the following:

type 'a 'b production = { name: string; rule: 'a expr; action: 'a -> 'b }

A production has a name, a rule that returns 'a and an action that receives 'a and returns 'b. The problem with simple parametric types is that a list of productions could not be polymorphic at all, so I can have an action of string_of_float and another of string_of_int in the same list; which is a common issue that can be solved with explicit , for example, action could be:

action: 'a 'b. 'a -> 'b

But I also need to have a constraint based on the parameter of expr, where 'a of action unifies to 'a of expr, so, as long as we can have inline record constructors in algebraic data types, we should be able to have them in generalized algebraic data types too, right?

type production = (* Ɐ a. b. *)
  | Production: { name: string; rule: 'a expr; action: 'a -> 'b } -> production

But this is not a valid form. The solution I found for now is making the rule also have the action type, for example:

type 'a expr =
  | Terminal of string * ('a -> 'a)
  | Sequence of 'a expr list * ('a list -> 'a)

But this kind of monomorphism is still very restrictive. How can I model existential types inside records with shared constraints, so I could have different return types in different productions where a compile-time checking of application is still possible (via exhaustive pattern-matching)?

Using the same logic, If I'd like to have a list of functions from 'a 'b. -> 'a -> 'b, would I need to use an existential type constructor to do that (like in [string_of_int; string_of_float])?

I believe I need some sort of kind restriction (they both are * -> *), but I came from Haskell and still didn't figure out how to do that in OCaml.

2

2 Answers

4
votes

I am not sure what you want. Supposing that there is a function 'a expr -> 'a, if you want to have a list of production with different inner expressions, you could define a production as

type 'r production = 
 | P: { name: string; rule: 'a expr; action: 'a -> 'r } -> 'r production

then you you could have a list of [P string_of_float; P string_of_int] while still knowing the result type of the action. (Moreover, your previous definition is valid.)

The other universal quantifications strike me as problematic: the only functions with type ∀'a∀'b. 'a -> 'b are variants of

let fail _ = assert false

Similarly, existentially quantifying the return type of action means that you can never recover the information about what it is this return type, in other words, it is as useful as having only action of type 'a -> unit.

1
votes

A semantic action of a production rule should have some effect otherwise there is no need in the action at all. You can encode this effect either using some sort of a state monad or directly in OCaml, i.e., as a mutable reference or direct I/O.

In your case, the existential

type production = (* Ɐ a. b. *)
  | Production: { name: string; rule: 'a expr; action: 'a -> 'b } -> production

has an effect of producing values of type 'b. That directly contradicts the definition of existential as in that case b would need to escape the context. So the proper definition would be

type production = (* Ɐ a. *)
  | Production: { name: string; rule: 'a expr; action: 'a -> unit }

or, if you're willing to use some monad, then you need to put the monad constructor in the prenex position

type 'm production = (* Ɐ a. *)
  | Production: { name: string; rule: 'a expr; action: 'a -> 'm }

or bind it on the module level:

type 'a m = ...
type production = (* Ɐ a. *)
  | Production: { name: string; rule: 'a expr; action: 'a -> unit m }

You can extend your existential with new operations, provided that they do not leak the existentially quantified type variables.

In fact, you may just use first-class modules, as they have existential type. E.g.,

module type Semantics = sig 
   type t 
   val action : t -> unit
   val rule : t expr
end

type production = (module Semantics)

As a final note, consider Oleg Kiselyov' tagless embedding as an alternative to the GADT representation. Keep in mind that a parser is just an interpreter :)