11
votes

I implemented a count function in Haskell and I am wondering will this behave badly on large lists :

count   :: Eq a => a -> [a] -> Int
count x =  length . filter (==x)

I believe the length function runs in linear time, is this correct?

Edit: Refactor suggested by @Higemaru

3
If you are really paranoid, you can also always do the equivalent of a single loop: count needle haystack = foldl' (\acc elem -> if elem == needle then acc+1 else acc) 0 haystack.kqr
You can't really count n items any faster than O(n), so I'm not sure what your worry is.augustss
@augustss - my worry was if I was doing a double pass over a list.user2913694
@decentral1se Thanks to laziness it will behave more like a single pass.augustss

3 Answers

10
votes

Length runs in linear time to the size of the list, yes.

Normally, you would be worried that your code had to take two passes through the list: first one to filter and then one to count the length of the resulting list. However, I believe this does not happen here because filter is not strict on the structure of the list. Instead, the length function forces the elements of the filtered list as it goes along, doing the actual count in one pass.

6
votes

I think you can make it slightly shorter

count :: Eq a => a -> [a] -> Int
count x = length . filter (x==)

(I would have written a (lowly) comment if I could)

0
votes

That really depends on the list. For a normal, lazily evaluated list of Ints on my computer, I see this function running in about 2 seconds for 10^9 elements, 0.2 seconds for 10^8, and 0.3 seconds for 10^7, so it appears to run in linear time. You can check this yourself by passing the flags +RTS -s -RTS to your executable when running it from the command line.

I also tried running it with more cores, but it doesn't seem to do anything but increase the memory usage a bit.

An added bonus of the lazy computation is that you only make a single pass over the list. filter and length get turned into a single loop by the compiler (with optimizations turned on), so you save memory and efficiency.