I'm trying to ensure that GHC specializes a recursive function to so that everything gets unboxed. The full example code (as well as a dump of the GHC core) is available in this gist. The function in question looks like this:
import Data.Bits
import qualified Data.Vector.Unboxed as UV
lookupSorted :: Ord a => (Int -> a) -> Int -> a -> Maybe Int
lookupSorted lookupIx len needle =
let !r = go 0 (len - 1)
in if r < 0 then Nothing else Just r
where
go :: Int -> Int -> Int
go !lo !hi = if lo <= hi
then
let !mid = lo + (unsafeShiftR (hi - lo) 1)
!val = lookupIx mid
in case compare val needle of
EQ -> mid
LT -> go (mid + 1) hi
GT -> go lo (mid - 1)
else (-1)
It's an algorithm that looks up a value from any sorted container than can be indexed into. The two functions that I want to ensure are specialized versions of this are:
{-# NOINLINE lookupUnboxedWord #-}
lookupUnboxedWord :: UV.Vector Word -> Word -> Maybe Int
lookupUnboxedWord v w = lookupSorted (UV.unsafeIndex v) (UV.length v) w
{-# NOINLINE lookupUnboxedDouble #-}
lookupUnboxedDouble :: UV.Vector Double -> Double -> Maybe Int
lookupUnboxedDouble v w = lookupSorted (UV.unsafeIndex v) (UV.length v) w
The good news is that, from looking at the dumped core, I can see that GHC is already performing the specialization that I'm interested in. This is pretty impressive. However, I'd like to be able to count on it happening. I'm concerned that if I add enough specialized variants to this file or that if I call lookupSorted
from another module, GHC might eventually lean in favor of a small generated executable rather than a fast one.
My understanding is that the SPECIALIZE
pragma does not help in the situation. GHC currently does not allow specialization based on value arguments. I'm pretty sure that if I'm willing to write a typeclass for the indexing operation, then I could make SPECIALIZE
work. I'm trying to avoid this approach though because I don't want to introduce a typeclass unless there's no other solution.
Is there a way to force GHC to create these specialized variants of my function? Additionally, if anyone has any commentary on the dumped core file (if something is not optimal), I would appreciate any feedback on that. Thanks.
----EDIT----
Thinking about this more, it seems like it might be sufficient to simply put an INLINE
pragma on lookupSorted
. The GHC docs are not clear on the interaction between INLINE
and recursive binding inside let
or where
clauses. Any clarification on this, hopefully with a source to back it up, could be helpful.