F# Func type inference difference between Seq and PSeq ToDictionary

Question

Given a trivial sequence of tuples, and a parallel one using F# PowerPack's PSeq:

let a = Seq.singleton (1,"a")  --> val a: seq<int * string>
let b = a |> PSeq.map id       --> val b: pseq<int * string>

Now I'd like to create a .Net BCL dictionary from them:

let bd = b.ToDictionary(fst, snd, HashIdentity.Structural)
let ad = a.ToDictionary(fst, snd, HashIdentity.Structural)
let ad2 = a.ToDictionary(fst, snd)
let ad3 = a.ToDictionary((fun (x,y) -> x), (fun (x,y) -> y), HashIdentity.Structural)

While let bd works, let ad does not compile as it fails to infer types and convert to BCL's Func correctly. Interestingly it works just fine if I omit the third parameter, as in let ad2, or if I write out fst and snd manually inline as in let ad3.

This expression was expected to have type Func<(int * string),'a> but here has type 'b * 'c -> 'b

Why does let ad not compile, while all the alternatives work fine?
Is there a way to make let ad compile, without providing inline functions or type annotations?

PS: I need HashIdentity.Structural because in the real code the keys are not integers but tuples or records

Update: I've now defined let dictionary (s : ('a * 'b) seq) = s.ToDictionary((fun (x,y)->x), (fun (x,y)->y), HashIdentity.Structural) so I can just write let ad = a |> dictionary, but I'm still interested in why it won't compile with the fst and snd functions.

I can think of a couple of solutions. Are you planning to create the dictionaries, then only read from them? Or will you be adding/removing entries after creation? — Jack P.
They won't be changed after creation (read only). I'm curious to fully understand why the types cannot be inferred properly when using the fst/snd functions, while it works fine when I write them inline as lambdas, or when using PSeq instead of Seq. — Christoph Rüegg
If you're creating read-only dictionaries, there's already a built-in function for that: dict. See @Daniel's answer for an example. — Jack P.

kvb kvb · Accepted Answer · 2012-10-17T14:49:56

I believe that this is not exactly a bug, but merely a very ugly corner case of F#'s type inference algorithm where overloads and type directed conversions are involved. Oddly, the compiler infers the ad2 binding's type correctly because there's a second overload of ToDictionary that has two arguments, which causes an additional overload resolution step during inference. On the other hand, there is only a single overload with three arguments, so the overload resolution step isn't used when trying to infer ad's type.

As I mentioned, the other piece of the puzzle is the type directed conversion from an F# function to a .NET delegate type (this is how you can pass an F# function where a Func<_,_> is expected). Basically, if you use an explicit lambda or if there are multiple overloads, then this conversion is considered, but if there is only a single overload and no explicit lambda then the conversion isn't considered. This means that the following will also work:

let ad3 = a.ToDictionary(System.Func<_,_>(fst), 
                         System.Func<_,_>(snd), HashIdentity.Structural)

because now there's no need to perform a type directed conversion.

This result is certainly counterintuitive, so I'd hope that there is some way to tweak the type inference algorithm to better handle these corner cases. Unfortunately, interoperating with certain aspects of the .NET type system (such as named delegate types, subtyping, overloading, etc.) makes inference much harder than it might otherwise be, and there may be some reason that the algorithm can't easily be modified to handle this case.

F# Func type inference difference between Seq and PSeq ToDictionary

2 Answers