I heard a lot about amazing performance of programs written in Haskell, and wanted to make some tests. So, I wrote a 'library' for matrix operations just to compare it's performance with the same stuff written in pure C. First of all I tested 500000 matrices multiplication performance, and noticed that it was... never-ending (i. e. ending with out of memory exception after 10 minutes of so)! After studying haskell a bit more I managed to get rid of laziness and the best result I managed to get is ~20 times slower than its equivalent in C. So, the question: could you review the code below and tell if its performance can be improved a bit more? 20 times is still disappointing me a bit.
import Prelude hiding (foldr, foldl, product)
import Data.Monoid
import Data.Foldable
import Text.Printf
import System.CPUTime
import System.Environment
data Vector a = Vec3 a a a
| Vec4 a a a a
deriving Show
instance Foldable Vector where
foldMap f (Vec3 a b c) = f a `mappend` f b `mappend` f c
foldMap f (Vec4 a b c d) = f a `mappend` f b `mappend` f c `mappend` f d
data Matr a = Matr !a !a !a !a
!a !a !a !a
!a !a !a !a
!a !a !a !a
instance Show a => Show (Matr a) where
show m = foldr f [] $ matrRows m
where f a b = show a ++ "\n" ++ b
matrCols (Matr a0 b0 c0 d0 a1 b1 c1 d1 a2 b2 c2 d2 a3 b3 c3 d3)
= [Vec4 a0 a1 a2 a3, Vec4 b0 b1 b2 b3, Vec4 c0 c1 c2 c3, Vec4 d0 d1 d2 d3]
matrRows (Matr a0 b0 c0 d0 a1 b1 c1 d1 a2 b2 c2 d2 a3 b3 c3 d3)
= [Vec4 a0 b0 c0 d0, Vec4 a1 b1 c1 d1, Vec4 a2 b2 c2 d2, Vec4 a3 b3 c3 d3]
matrFromList [a0, b0, c0, d0, a1, b1, c1, d1, a2, b2, c2, d2, a3, b3, c3, d3]
= Matr a0 b0 c0 d0
a1 b1 c1 d1
a2 b2 c2 d2
a3 b3 c3 d3
matrId :: Matr Double
matrId = Matr 1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
normalise (Vec4 x y z w) = Vec4 (x/w) (y/w) (z/w) 1
mult a b = matrFromList [f r c | r <- matrRows a, c <- matrCols b] where
f a b = foldr (+) 0 $ zipWith (*) (toList a) (toList b)
mult
is changing representations between matrices, lists and back to matrices. Whilst this does allow you to do the the computation in two lines of code, it will be very, very slow. – stephen tetleySTM
andparallel
, which make adding parallelism very easy compared to most other languages. For pure numerical work, it's quite difficult for Haskell performance to beat C implementations. – John L(a*m+b*p+c*s)
. Of course, the code is long winded for the 4x4 matrix. Lists are simply "wrong" for matrices as indexing is slow, they have no bounds, etc. – stephen tetley