8
votes

Usually questions in the Haskell tag are why is haskell so slow compared to X. Mostly you can tie that down to usages of String instead of Text or ByteString. Non-strict evaluation or lack of type signatures.

But here I have a simple fibonacci calculator that outperforms C++ by about a factor of 2. This might be either a lack of c++ knowledge - but I got the code from a friend, who used to code more than just a bit in this language.

★ g++ -O3 fib2.cc -o cc-fib -lgmpxx -lgmp 
★ time ./cc-fib > /dev/null
./cc-fib > /dev/null  8,23s user 0,00s system 100% cpu 8,234 total

★ ghc -O3 --make -o hs-fib fib1.hs
[1 of 1] Compiling Main             ( fib1.hs, fib1.o )
Linking hs-fib ...
★ time ./hs-fib > /dev/null
./hs-fib > /dev/null  4,36s user 0,03s system 99% cpu 4,403 total

In the haskell file I used just simply a strict zipWith' and a strict add' function (this is where the extension BangPatterns is used - it just tells the compiler to evaluate the arguments x/y before performing a the addition) as well as adding an explicit type signature.

Both versions use bigint, so this seems comparable to me, also the c++ code does not use the "standard" recursion which has an exponential run-time but a memoized version that should behave nicely (or at least that's what I think - please correct me if I'm wrong).

Setup used is:

  • Linux 64-bit (Mint) on a fairly recent laptop
  • ghc-7.10.3
  • g++ 4.8.4 + libgmp-dev 2:5.1.3+dfsg-1ubuntu1

fib.cc

#include <iostream>
#include <gmpxx.h>

mpz_class fib(int n) {
    mpz_class p1 = 0;
    mpz_class p2 = 1;
    mpz_class result;
    if ( n == 0 ) return 0;
    if ( n == 1 ) return 1;
    for(int i = 2; i <= n ; i ++ ) {
        result = p1 + p2;
        p1 = p2;
        p2 = result;
    }
    return result;
}

int main () {
    std::cout<<fib(1000000)<<std::endl;
    return 0;
}

fib.hs

{-# LANGUAGE BangPatterns -#}
module Main where

fib1 :: [Integer]
fib1 = 0:1:zipWith' (add') fib1 (tail fib1)
     where zipWith' :: (Integer -> Integer -> Integer) -> [Integer] -> [Integer] -> [Integer]
           zipWith' _ [] _ = []
           zipWith' _ _ [] = []
           zipWith' f (x:xs) (y:ys) = let z = f x y in z:zipWith' f xs ys
           add' :: Integer -> Integer -> Integer
           add' !x !y = let z = x + y in z `seq` z

fib4 :: [Integer]
fib4 = 0:1:zipWith (+) fib4 (tail fib4)

main :: IO ()
main = print $ fib1 !! 1000000
1
This seems more like an issue of GMP's design tradeoffs vs your Haskell implementation's Integer, not C++ itself vs Haskell.user2357112 supports Monica
On second thought, you might be doing a bunch of unnecessary copies in the C++. I'm not sure how mpz_class's operator= works.user2357112 supports Monica
Quoting from homepage "GMP is carefully designed to be as fast as possible, both for small operands and for huge operands" I thought gmp would be the correct choice for a fast implementation. Can you tell more about those design choices?epsilonhalbe
@user2357112: GHC uses GMP, so it's probably not that. (Technically this is only one possible compile-time configuration, but I've never seen any other in practice; the other one I know is "integer-simple" and should be slower.)Antal Spector-Zabusky
You can probably speed up the C++ implementation by replacing p1 = p2; p2 = result; with p1.swap(p2); p2.swap(result);Benjamin Lindley

1 Answers

9
votes

Given the really huge number you are printing, the poor default performance of iostreams may have something to do with it. Indeed, on my system, putting

 std::ios_base::sync_with_stdio(false);

at the beginning of main improved the time slightly (from 20 seconds to 18).

Also, copying around the huge numbers so much is bound to slow it down. If instead, you update both p1 and p2 in each step, there's no need to copy them around. You also only need half as many steps in the loop. Like this:

mpz_class fib(int n) {
    mpz_class p1 = 0;
    mpz_class p2 = 1;
    for(int i = 1; i <= n/2 ; i ++ ) {
        p1 += p2;
        p2 += p1;
    }
    return (n % 2) ? p2 : p1;
}

This dramatically speeds it up on my system (from 18 seconds to 8).

Of course, to really see how fast it can be done with GMP, you should just use the function that does that:

mpz_class fib(int n) {
    mpz_class result;
    mpz_fib_ui(result.get_mpz_t(), n);
    return result;
}

This is effectively instantaneous on my machine (and yes, it prints the same 208,989-digit number that the other two methods do).