31
votes

If I want to call more than one C function, each one depending on the result of the previous one, is it better to create a wrapper C function that handles the three calls? Will it cost the same as using Haskell FFI without converting types?

Suppose I have the following Haskell code:

foo :: CInt -> IO CInt
foo x = do
  a <- cfA x
  b <- cfB a
  c <- cfC b
  return c

Each function cf* is a C call.

Is it better, in terms of performance, to create a single C function like cfABC and make only one foreign call in Haskell?

int cfABC(int x) {
   int a, b, c;
   a = cfA(x);
   b = cfB(a);
   c = cfC(b);
   return c;
}

Haskell code:

foo :: CInt -> IO CInt
foo x = do
  c <- cfABC x
  return c

How can I measure the performance cost of a C call from Haskell? Not the cost of the C function itself, but the cost of the "context switch" from Haskell to C and back.

2
I'm not at all sure, but I found this blog post enlightening. If I interpret it correctly, foreign import ccall unsafe (with unsafe being the key) is essentially as cheap as an inline C function call. However, great care has to be taken when using unsafe, and the safe variant (plain foreign import ccall) costs more and involves taking locks. - gspr
You'd think any difference would be compiled away... - Colin Woodbury
@fosskers: what do you mean? - gspr
@ThiagoNegri: I did some crude (non-Criterion) benchmarks to compare foreign import ccall and foreign import ccall unsafe. I have a C function that, given a double x, returns sin(x)*sin(x)*cos(x)/2.0. I compiled it with GCC 4.7.2 and -O2. The benchmark calls it with 100000000 different arguments from 0 to pi/2 and sums the results. With foreign import ccall it ran in about 9.6 seconds, compared to 4.6 seconds for foreign import ccall unsafe. Calling it from an actual C program gave a running time of 4.4-4.5 seconds. This gives you some idea, at least. The Haskell code was compiled with GHC 7.4.2. - gspr
@gspr: Forget I said anything. My knowledge of the FFI is insufficient. - Colin Woodbury

2 Answers

20
votes

The answer depends mostly on whether the foreign call is a safe or an unsafe call.

An unsafe C call is basically just a function call. If there is no (nontrivial) type conversion, three foreign calls cost three function calls. A wrapper written in C costs between one and four, depending on how many of the component functions the C compiler can inline into it; GHC cannot inline a foreign call into C. Such a function call is generally very cheap (it's just a copy of the arguments and a jump to the code), so the difference is small either way. The wrapper should be slightly slower when no C function can be inlined into it, and slightly faster when all can be inlined. That was indeed the case in my benchmarking: +1.5ns and -3.5ns respectively, where the three foreign calls took about 12.7ns with every function just returning its argument. If the functions do something nontrivial, the difference is negligible (and if they don't, you'd probably do better to write them in Haskell and let GHC inline the code).
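
To make the "between one and four" count concrete, here is a minimal C sketch. It is not taken from my benchmark: the function bodies are made-up placeholders, and `static inline` is just one way of making the components visible to the C compiler. When the components live in the same translation unit as the wrapper, an optimizing compiler can fold them into `cfABC`, so one foreign call costs one C function call.

```c
/* Hypothetical sketch: the bodies below are placeholders, not real code
   from the question or the benchmark. */

/* Components defined in the same translation unit as the wrapper.
   "static inline" keeps them local and invites the compiler to inline them. */
static inline int cfA(int x) { return x + 1; }
static inline int cfB(int x) { return x * 2; }
static inline int cfC(int x) { return x - 3; }

/* The only symbol Haskell needs to import. With optimization enabled,
   the three calls below can be inlined, leaving a single function call
   per foreign call. */
int cfABC(int x) {
    return cfC(cfB(cfA(x)));
}
```

If `cfA`, `cfB` and `cfC` were only declared here and defined in another object file, the compiler typically could not inline them (absent link-time optimization), and each call to `cfABC` would cost four function calls.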

A safe C call involves saving some nontrivial amount of state, locking, and possibly spawning a new OS thread, so it takes much longer. The small overhead of perhaps one extra function call in C is then negligible compared to the cost of the foreign calls [unless passing the arguments requires an unusual amount of copying, e.g. many huge structs]. In my do-nothing benchmark

{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Criterion.Main
import Foreign.C.Types
import Control.Monad

foreign import ccall safe "funcs.h cfA" c_cfA :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfB" c_cfB :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfC" c_cfC :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfABC" c_cfABC :: CInt -> IO CInt

wrap :: (CInt -> IO CInt) -> Int -> IO Int
wrap foo arg = fmap fromIntegral $ foo (fromIntegral arg)

cfabc = wrap c_cfABC

foo :: Int -> IO Int
foo = wrap (c_cfA >=> c_cfB >=> c_cfC)

main :: IO ()
main = defaultMain
            -- whnfIO wraps the IO action as a Benchmarkable (needed with criterion >= 1.0)
            [ bench "three calls" $ whnfIO (foo 16)
            , bench "single call" $ whnfIO (cfabc 16)
            ]

where all the C functions just return the argument, the mean for the single wrapped call is a bit above 100ns [105-112], and for the three separate calls around 300ns [290-315].
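
The C side of this benchmark is not shown above. Since every function just returns its argument, it would look roughly like the following sketch. The file layout is an assumption: the prototypes would normally live in funcs.h (the header named in the foreign import declarations) and the definitions in a C file compiled and linked together with the Haskell program.

```c
/* funcs.h part (assumed layout): prototypes matching the foreign imports. */
int cfA(int x);
int cfB(int x);
int cfC(int x);
int cfABC(int x);

/* funcs.c part (assumed layout): do-nothing bodies, each returning its argument. */
int cfA(int x) { return x; }
int cfB(int x) { return x; }
int cfC(int x) { return x; }

/* Wrapper that chains the three calls, as in the question. */
int cfABC(int x) {
    int a, b, c;
    a = cfA(x);
    b = cfB(a);
    c = cfC(b);
    return c;
}
```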

So a safe C call takes roughly 100ns, and it is then usually faster to wrap several of them into a single call. Still, if the called functions do something sufficiently nontrivial, the difference won't matter.

-3
votes

That probably depends very much on your exact Haskell compiler, the C compiler, and the glue binding them together. The only way to find out for sure is to measure it.

On a more philosophical note, each time you mix languages you create a barrier for newcomers: in this case it isn't enough to be fluent in Haskell and C (which already gives a narrow set); you also have to know the calling conventions and whatnot well enough to work with them. And there are often subtle issues to handle (even calling C from C++, two very similar languages, isn't at all trivial). Unless there are very compelling reasons, I'd stick with a single language. The only exception I can think of offhand is creating e.g. Haskell bindings to a preexisting complex library, something like NumPy for Python.