There are two conventions I've found in Coq's SSReflect extension that seem particularly useful but which I haven't seen widely adopted in newer dependently-typed languages (Lean, Agda, Idris).
Firstly, where possible, predicates are expressed as boolean-returning functions rather than as inductively defined propositions. This makes them decidable by default, opens up more opportunities for proof by computation, and improves checking performance, since the proof engine does not need to carry around large proof terms. The main disadvantage I see is the need to use reflection lemmas to move between the boolean and propositional views when proving.
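For concreteness, here is a minimal sketch of both sides (assuming the MathComp library is available; the lemma names by_computation and via_reflection are just illustrative): a closed boolean statement is proved purely by computation, while reflection lemmas such as eqP and andP bridge to the propositional view when hypotheses need to be taken apart.

From mathcomp Require Import all_ssreflect.

(* A closed boolean statement simply evaluates to true, so the proof
   is "by []" and the resulting proof term is tiny. *)
Lemma by_computation : (3 == 3) && (2 <= 5) && uniq [:: 1; 2; 3].
Proof. by []. Qed.

(* Reflection lemmas (andP, eqP) convert between the boolean view and
   the Prop view when a hypothesis needs to be taken apart. *)
Lemma via_reflection (m n : nat) : (m == n) && (n == 0) -> m = 0.
Proof. by case/andP=> /eqP -> /eqP ->. Qed.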
Secondly, datatypes with invariants are defined as dependent records containing a simple datatype plus a proof of the invariant. For instance, fixed-length sequences are defined in SSReflect like so:
Structure tuple_of : Type := Tuple {tval :> seq T; _ : size tval == n}.
That is, a seq paired with a proof that its length equals n. This is in contrast to how, e.g., Idris defines the corresponding type:
data Vect : (len : Nat) -> (elem : Type) -> Type
A dependently typed data structure in which the invariant is part of the type itself. One advantage of SSReflect's approach is that it allows reuse: many of the functions defined for seq, and the lemmas proved about them, can still be used with tuple (by operating on the underlying seq), whereas with Idris's approach functions like reverse, append and the like need to be rewritten for Vect. Lean actually has an SSReflect-style equivalent in its standard library, vector, but it also has an Idris-style array, which seems to have an optimised implementation in the runtime.
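As a small sketch of this reuse (again assuming MathComp; the names t3 and seq_reuse are mine), the size proof required by the Tuple constructor is discharged by computation, and seq functions and lemmas such as rev, size_rev and size_tuple apply to a tuple directly through the tval coercion:

From mathcomp Require Import all_ssreflect.

(* Building a 3.-tuple: a plain seq plus a proof that its size is 3.
   The obligation "size [:: 1; 2; 3] == 3" evaluates to true, so isT
   (the trivial proof of true) suffices. *)
Definition t3 : 3.-tuple nat := @Tuple 3 nat [:: 1; 2; 3] isT.

(* seq functions (rev, size) and their lemmas apply to tuples through
   the tval coercion; size_tuple recovers the length invariant. *)
Example seq_reuse (t : 3.-tuple nat) : size (rev t) = 3.
Proof. by rewrite size_rev size_tuple. Qed.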
One SSReflect-oriented book even claims that the Vect n A style of definition is an anti-pattern:
A common anti-pattern in dependently-typed languages and Coq in particular is to encode such algebraic properties into the definitions of the datatypes and functions themselves (a canonical example of such approach are length-indexed lists). While this approach looks appealing, as it demonstrates the power of dependent types to capture certain properties of datatypes and functions on them, it is inherently non-scalable, as there will be always another property of interest, which has not been foreseen by a designer of the datatype/function, so it will have to be encoded as an external fact anyway. This is why we advocate the approach, in which datatypes and functions are defined as close to the way they would be defined by a programmer as possible, and all necessary properties of them are proved separately.
My question, then, is: why haven't these approaches been more widely adopted? Are there disadvantages I'm missing, or are their advantages perhaps less significant in languages with better support for dependent pattern matching than Coq?