7
votes

Does GHC ever unpack sum types when passing them to functions? For example, let's say that we have the following type:

data Foo
  = Foo1 {-# UNPACK #-} !Int {-# UNPACK #-} !Word
  | Foo2 {-# UNPACK #-} !Int
  | Foo3 {-# UNPACK #-} !Word

Then I define a function that is strict in its Foo argument:

consumeFoo :: Foo -> Int
consumeFoo x = case x of ...

At runtime, when I call consumeFoo, what can I expect to happen? The GHC calling convention is to pass arguments in registers (or on the stack once there are too many). I can see two ways that the argument passing could go:

  1. A pointer to a Foo on the heap gets passed in as one argument.
  2. A three-argument representation of Foo is used, one argument representing the data constructor that was used and the other two representing the possible Int and Word values in the data constructor.

I would prefer the second representation, but I don't know if it is actually what happens. I am aware of UnpackedSumTypes landing in GHC 8.2, but it's unclear if it does what I want. If I had instead written the function as:

consumeFooAlt :: (# (# Int#, Word# #) | Int# | Word# #) -> Int

Then I would expect that evaluation (2) would be what happens. And the Unpacking section of the unpacked sums page indicates that I could do this as well:

data Wrap = Wrap {-# UNPACK #-} !Foo
consumeFooAlt2 :: Wrap -> Int

And that should also have the representation I want, I think.

So my question is, without using a wrapper type or a raw unpacked sum, how can I guarentee that a sum is unpacked into registers (or onto the stack) when I pass it as an argument to a function? If it is possible, is it something that GHC 8.0 can already do, or is it something that will only be available in GHC 8.2?

2

2 Answers

7
votes

First: Guaranteed optimization and GHC don't mix well. Due to the high level it is very hard to predict the code that GHC will generate in every case. The only way to be sure is to look at the Core. If you're developing an extremely performance dependent application with GHC, then you need to become familar with Core I.

I am not aware of any optimization in GHC that does exactly what you describe. Here is an example program:

module Test where

data Sum = A {-# UNPACK #-} !Int | B {-# UNPACK #-} !Int

consumeSum :: Sum -> Int
consumeSum x = case x of
  A y -> y + 1
  B y -> y + 2

{-# NOINLINE consumeSumNoinline #-}
consumeSumNoinline = consumeSum

{-# INLINE produceSumInline #-}
produceSumInline :: Int -> Sum
produceSumInline x = if x == 0 then A x else B x

{-# NOINLINE produceSumNoinline #-}
produceSumNoinline :: Int -> Sum
produceSumNoinline x = if x == 0 then A x else B x

test :: Int -> Int
--test x = consumeSum (produceSumInline x)
test x = consumeSumNoinline (produceSumNoinline x)

Let's first look at what happens if we don't inline consumeSum nor produceSum. Here is the core:

test :: Int -> Int
test = \ (x :: Int) -> consumeSumNoinline (produceSumNoinline x)

(produced with ghc-core test.hs -- -dsuppress-unfoldings -dsuppress-idinfo -dsuppress-module-prefixes -dsuppress-uniques)

Here, we can see that GHC (8.0 in this case) does not unbox the sum type passed as a function argument. Nothing changes if we inline either consumeSum or produceSum.

If we inline both however, then the following code is generated:

test :: Int -> Int
test =
  \ (x :: Int) ->
    case x of _ { I# x1 ->
    case x1 of wild1 {
      __DEFAULT -> I# (+# wild1 2#);
      0# -> lvl1
    }
    }

What happened here is that through inlining, GHC ends up with:

\x -> case (if x == 0 then A x else B x) of
   A y -> y + 1
   B y -> y + 2

Which through the case-of-case (if is just a special case) is turned into:

\x -> if x == 0 then case (A x) of ...  else case (B x) of ...

Now that is a case with a known constructor, so GHC can reduce the case at compile time ending up with:

\x -> if x == 0 then x + 1 else x + 2

So it completely eliminated the constructor.


In summary, I believe that GHC does not have any concept of an "unboxed sum" type prior to version 8.2, which also applies to function arguments. The only way to get "unboxed" sums is to get the constructor eliminated completely through inlining.

2
votes

If you need such an optimization, your simplest solution is to do it yourself. I think there are actually many ways to achieve this, but one is:

data Which = Left | Right | Both
data Foo = Foo Which Int Word

The unpacking of any fields of this type is completely irrelevant to the question of the 'shape of the representation', which is what you are really asking about. Enumerations are already highly optimized - only one value for every constructor is ever created - so the addition of this field doesn't affect performance. The unpacked representation of this type is precisely what you want - one word for Which constructor and one for each field.

If you write your functions in the proper way, you get the proper code:

data Which = Lft | Rgt | Both
data Foo = Foo Which {-# UNPACK #-} !Int {-# UNPACK #-} !Word 

consumeFoo :: Foo -> Int 
consumeFoo (Foo w l r) = 
  case w of 
    Lft  -> l 
    Rgt  -> fromIntegral r 
    Both -> l + fromIntegral r 

The generated core is quite obvious:

consumeFoo :: Foo -> Int
consumeFoo =
  \ (ds :: Foo) ->
    case ds of _ { Foo w dt dt1 ->
    case w of _ {
      Lft -> I# dt;
      Rgt -> I# (word2Int# dt1);
      Both -> I# (+# dt (word2Int# dt1))
    }
    }

However, for simple programs such as:

consumeFoos = foldl' (+) 0 . map consumeFoo 

This optimization makes no difference. As is indicated in the other answer, the inner function consumeFoo is just inlined:

Rec {
$wgo :: [Foo] -> Int# -> Int#
$wgo =
  \ (w :: [Foo]) (ww :: Int#) ->
    case w of _ {
      [] -> ww;
      : y ys ->
        case y of _ {
          Lft dt -> $wgo ys (+# ww dt);
          Rgt dt -> $wgo ys (+# ww (word2Int# dt));
          Both dt dt1 -> $wgo ys (+# ww (+# dt (word2Int# dt1)))
        }
    }
end Rec }

vs.

Rec {
$wgo :: [Foo] -> Int# -> Int#
$wgo =
  \ (w :: [Foo]) (ww :: Int#) ->
    case w of _ {
      [] -> ww;
      : y ys ->
        case y of _ { Foo w1 dt dt1 ->
        case w1 of _ {
          Lft -> $wgo ys (+# ww dt);
          Rgt -> $wgo ys (+# ww (word2Int# dt1));
          Both -> $wgo ys (+# ww (+# dt (word2Int# dt1)))
        }
        }
    }
end Rec }

Which, in almost every case when working with low-level, unpacked data, is the goal anyways, as most of your functions are small and cost little to inline.