4
votes

Have a look at this F#/OCaml code:

type AllPossible =
    | A of int
    | B of int*int
    | ...
    | Z of ...

let foo x =
    ....
    match x with
    | A(value) | B(value,_) ->                   (* LINE 1 *)
        (* do something with the first (or only, in the case of A) value *)
        ...
        (* now do something that is different in the case of B *)
        let possibleData = 
            match x with
            | A(a) -> bar1(a)
            | B(a,b) -> bar2(a+b)
            | _ -> failwith "unreachable"    (* the problem - read below *)
        (* work with possibleData *)
    ...
    | Z _ -> ...

So what is the problem? In function foo, we pattern match against a type with many cases. Some of the cases share functionality, i.e. they have common work to do, so we combine them in one rule ('| A | B ->' in LINE 1, above). We read the only integer (in the case of A), or the first integer (in the case of B), and do something with it.

Next, we want to do something completely different depending on whether we are working on A or B (i.e. call bar1 or bar2). We now have to pattern match again, and here's the problem: in this nested pattern match, unless we add a catch-all rule (i.e. '_'), the compiler complains that we are missing cases, i.e. it doesn't take into account that only A and B can reach this point.

But if we add the catch-all rule, we have a far worse problem: if at some point we add more cases to the rule in LINE 1 (i.e. to the line '| A | B ->'), the compiler will NOT help us in the nested match. The '_' will catch them, and the bug will only be detected at RUNTIME. One of the most important powers of pattern matching, i.e. detecting such errors at compile time, is lost.

Is there a better way to write this kind of code, without having to repeat the work shared by A and B in two separate rules (or moving the shared work into a function created solely for the purpose of "local code sharing" between A and B)?

EDIT: Note that one could argue that the F# compiler's behaviour is buggy in this case - it should be able to detect that there's no need for matching beyond A and B in the nested match.

5 Answers

5
votes

If the datatype is set in stone, I would also prefer a local function.

Otherwise, in OCaml you could also use open (aka polymorphic) variants:

type t = [`A | `B | `C]
let f = function
| (`A | `B as x) ->
  let s = match x with `A -> "a" | `B -> "b" in
  print_endline s
| `C -> print_endline "ugh"
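
The same idea can be applied to the question's payload-carrying cases. The following is only a sketch: the type is a cut-down, hypothetical version of the question's AllPossible, and the bodies of bar1 and bar2 are made-up stand-ins. Because the alias ab only admits the tags listed in the or-pattern, the nested match needs no catch-all, and adding a tag to the outer rule without handling it in the inner match should be rejected at compile time.

```ocaml
(* Hypothetical cut-down version of the question's type, as polymorphic variants. *)
type t = [ `A of int | `B of int * int | `Z ]

(* Stand-ins for the question's bar1/bar2; their bodies are assumptions. *)
let bar1 a = a * 10
let bar2 s = s * 100

let foo (x : t) =
  match x with
  | (`A value | `B (value, _)) as ab ->
    (* shared work on the first (or only) value *)
    let base = value + 1 in
    (* case-specific work: ab can only be `A or `B here, so no '_' rule *)
    let possible_data =
      match ab with
      | `A a -> bar1 a
      | `B (a, b) -> bar2 (a + b)
    in
    (* shared work with possible_data *)
    base + possible_data
  | `Z -> 0
```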

4
votes

I would just put the common logic in a local function; it should be both faster and more readable. Matches nested that way are pretty hard to follow, and putting the common logic in a local function lets you ditch the extra matching in favour of something that will likely get inlined anyway.
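
A minimal OCaml sketch of that suggestion, under some assumptions: the type is a cut-down version of the question's, the bodies of bar1 and bar2 are invented, and the local function shared is a hypothetical name. The case-specific step is passed in as a thunk so it still runs between the two shared parts.

```ocaml
(* Hypothetical cut-down version of the question's type. *)
type all_possible =
  | A of int
  | B of int * int
  | Z

(* Stand-ins for the question's bar1/bar2; their bodies are assumptions. *)
let bar1 a = a * 10
let bar2 s = s * 100

let foo x =
  (* Local function holding everything A and B have in common. *)
  let shared value specific =
    let base = value + 1 in            (* shared work on the first value *)
    let possible_data = specific () in (* the part that differs per case *)
    base + possible_data               (* shared work with possible_data *)
  in
  match x with
  | A a -> shared a (fun () -> bar1 a)
  | B (a, b) -> shared a (fun () -> bar2 (a + b))
  | Z -> 0
```

With this shape there is a single match and no catch-all rule, so a new case added to the type is reported as a missing case in foo.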

1
votes

Hmm, it looks like you need to design the data type a bit differently, such as:

type AorB = 
    | A of int
    | B of int * int

type AllPossible =
    | AB of AorB
    | C of int
    | ...   // other cases

let foo x =
    match x with
    | AB(v) -> 
        match v with
        | A(i) -> () // do whatever is needed for A
        | B(i,v) -> () // do whatever is needed for B
    | _ -> ()
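
For comparison, here is roughly how the split type could carry the question's shared work in OCaml. This is a sketch with assumed details: the helper first_value and the bodies of bar1 and bar2 are hypothetical. The inner match ranges over a_or_b only, so it is exhaustive without a '_' rule.

```ocaml
(* The cases that share work get their own type. *)
type a_or_b =
  | A of int
  | B of int * int

type all_possible =
  | AB of a_or_b
  | C of int

(* Stand-ins for the question's bar1/bar2; their bodies are assumptions. *)
let bar1 a = a * 10
let bar2 s = s * 100

(* Hypothetical helper: the first (or only) value of an a_or_b. *)
let first_value = function
  | A v | B (v, _) -> v

let foo x =
  match x with
  | AB ab ->
    let base = first_value ab + 1 in   (* shared work *)
    let possible_data =
      match ab with                    (* exhaustive over a_or_b *)
      | A a -> bar1 a
      | B (a, b) -> bar2 (a + b)
    in
    base + possible_data               (* shared work with possible_data *)
  | C n -> n
```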

1
votes

Perhaps the better solution is that rather than

type All =
    |A of int
    |B of int*int

you have

type All = 
    |AorB of int * int option

If you bind the data in different ways later on, you might be better off using an active pattern rather than a type, but the result would be basically the same.
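
A rough OCaml sketch of the merged-case variant (OCaml has no active patterns, so only the type-based version is shown). The extra case Z and the bodies of bar1 and bar2 are assumptions added for illustration. The nested match is over an int option, so it is exhaustive by construction.

```ocaml
(* A and B folded into one case; the second int is optional. *)
type all =
  | AorB of int * int option
  | Z   (* hypothetical extra case *)

(* Stand-ins for the question's bar1/bar2; their bodies are assumptions. *)
let bar1 a = a * 10
let bar2 s = s * 100

let foo = function
  | AorB (value, second) ->
    let base = value + 1 in            (* shared work on the first value *)
    let possible_data =
      match second with
      | None -> bar1 value             (* the old A case *)
      | Some b -> bar2 (value + b)     (* the old B case *)
    in
    base + possible_data               (* shared work with possible_data *)
  | Z -> 0
```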

1
votes

I don't really agree that this should be seen as a bug - although it would definitely be convenient if the case was handled by the compiler.

The C# compiler doesn't complain about the following, and you wouldn't expect it to:

var b = true;

if (b)
    if (!b)
       Console.WriteLine("Can never be reached");