Haskell - List comprehension with infinite lists

Question

This is the piece of code

primepowers n = foldr merge [] [ map (^i) primes | i <- [1..n] ]  -- (1)

merge::(Ord t) =>[t]->[t]->[t]
merge x [] = x
merge [] y = y
merge (x:xs) (y:ys)
    | x < y = x:merge xs (y:ys)
    | otherwise = y:merge (x:xs) ys

which is equal to the mathematical expression {p^i | p is prime, 1 <= i <= n}.

prime returns an infinite list of prime numbers. What I am interested is in the evaluation of (1). These are my thoughts:

If we first just look at [ map (^i) primes | i <- [1..3] ] this would return an infinite list of [[2,3,5,7,9,...],...]. But as we know p^1 (p is prime) never ends, Haskell will never evaluate [p^2] and [p^3]. Is this just because it is an infinite list or because of lazy evaluation?

Let's carry on with merge: merge will return [2,3,5,7,9,11,...] because again we still have an infinite list or because of some other reason?

Now to foldr: foldr starts evaluating from back. Here with specifically ask for the rightmost element, which is a infinite list [p^3]. So the evaluation would be like this

merge (merge (merge [] [p^3]) [p^2]) [p^1]

But we should not forget that these lists are infinite, so how does Haskell deal with that fact?

Could anyone explain me the evaluation process of the above function?

One good way to see when something gets evaluated is to use the Debug.Trace module. — Mokosha
You can not use foldr on infinite lists because it will search forever for the last element. — ditoslav
@DominikDitoIvosevic, that's not true. Try foldr const 0 [1..] — user3237465
@DominikDitoIvosevic, stackoverflow.com/questions/3082324/… — user3237465
for some background, cf. joerg.endrullis.de/research.html#productivity — d8d0d65b3f7cf42

Will Ness Will Ness · Accepted Answer · 2014-11-13T15:33:06

The trick is to define it as

primepowers n = foldr (\(x:xs) r-> x:merge xs r) 
                      [] [ map (^i) primes | i <- [1..n] ]

^{(as seen in Richard Bird's code in the article O'Neill, Melissa E., "The Genuine Sieve of Eratosthenes").}

The lists to the right of a current one all start with bigger numbers, there's no chance of their merged list ever producing a value smaller or equal to the current list's head, so it can be produced unconditionally.

That way it will also explore only as many of the internal streams as needed:

GHCi> let pps_list = [ map (^i) primes | i <- [1..42] ]
GHCi> :sprint pps_list
pps_list = _
GHCi> take 20 $ foldr (\(x:xs) r-> x:merge xs r) [] pps_list
[2,3,4,5,7,8,9,11,13,16,17,19,23,25,27,29,31,32,37,41]
GHCi> :sprint pps_list
pps_list = (2 : 3 : 5 : 7 : 11 : 13 : 17 : 19 : 23 : 29 : 31 : 37 :
            41 : _) :
           (4 : 9 : 25 : 49 : _) : (8 : 27 : 125 : _) : (16 : 81 : _) :
           (32 : 243 : _) : (64 : _) : _

To your question per se, foldr f z [a,b,c,...,n] = f a (f b (f c (... (f n z)...))) so (writing ps_n for map (^n) primes), your expression is equivalent to

merge ps (merge ps_2 (merge ps_3 (... (merge ps_n [])...)))
= merge ps r
     where r = merge ps_2 (merge ps_3 (... (merge ps_n [])...))

because you use merge as your combining function. Notice that the leftmost merge springs into action first, while the expression for r isn't even built yet (because its value wasn't yet needed - Haskell's evaluation is by need.)

Now, this merge demands the head value of both its first and second argument (as written, it actually checks the second argument first, for being []).

The first argument isn't the problem, but the second is the result of folding all the rest of the lists ("r" in foldr's combining function stands for "recursive result"). Thus, each element in the list will be visited and its head element forced - and all this just to produce one very first value, the head of the result list, by the leftmost merge call...

In my code, the combining function does not at first demand the head of its second argument list. That's what limits its exploration of the whole list of lists, makes it more economic in its demands, and thus more productive (it will even work if you just omit the n altogether).

Your example Haskell expression [ map (^i) primes | i <- [1..3] ] returns finite list of length 3, each element being an infinite list: [[2,3,5,7,11,...],[4,9,25,...],[8,27,125,...]] so foldr has no problem translating it into merge [2,3,5,7,11,...] (merge [4,9,25,...] (merge [8,27,125,..] [])):

foldr merge [] [ map (^i) primes | i <- [1..3] ]
= merge [2,3,5,7,11,...] (foldr merge [] [ map (^i) primes | i <- [2..3] ])
= merge [2,3,5,7,11,...] (merge [4,9,25,...] (foldr merge [] [ map (^i) primes | i <- [3..3] ]))
= merge [2,3,5,7,11,...] (merge [4,9,25,...] (merge [8,27,125,..] (foldr merge [] [])))
= merge [2,3,5,7,11,...] (merge [4,9,25,...] (merge [8,27,125,..] []))
= merge [2,3,5,7,11,...] (merge [4,9,25,...] [8,27,125,..])
= merge [2,3,5,7,11,...] (4:merge [9,25,...] [8,27,125,..])
= 2:merge [3,5,7,11,...] (4:merge [9,25,...] [8,27,125,..])
= 2:3:merge [5,7,11,...] (4:merge [9,25,...] [8,27,125,..])
= 2:3:4:merge [5,7,11,...] (merge [9,25,...] [8,27,125,..])
= 2:3:4:merge [5,7,11,...] (8:merge [9,25,...] [27,125,..])
= 2:3:4:5:merge [7,11,...] (8:merge [9,25,...] [27,125,..])
.....

As you can see, the rightmost inner list is examined first, because merge is strict in (i.e. demands to know) both its arguments, as explained above. For [ map (^i) primes | i <- [1..42] ] it would expand all 42 of them, and examine the heads of all of them, before producing even the head element of the result.

With the tweaked function, mg (x:xs) r = x:merge xs r, the evaluation proceeds as

foldr mg [] [ map (^i) primes | i <- [1..3] ]
= mg [2,3,5,7,11,...] (foldr mg [] [ map (^i) primes | i <- [2..3] ])
= 2:merge [3,5,7,11,...] (foldr mg [] [ map (^i) primes | i <- [2..3] ])
= 2:merge [3,5,7,11,...] (mg [4,9,25,...] (foldr mg [] [ map (^i) primes | i <- [3..3] ]))
= 2:merge [3,5,7,11,...] (4:merge [9,25,...] (foldr mg [] [ map (^i) primes | i <- [3..3] ]))
= 2:3:merge [5,7,11,...] (4:merge [9,25,...] (foldr mg [] [ map (^i) primes | i <- [3..3] ]))
= 2:3:4:merge [5,7,11,...] (merge [9,25,...] (foldr mg [] [ map (^i) primes | i <- [3..3] ]))
= 2:3:4:merge [5,7,11,...] (merge [9,25,...] (mg [8,27,125,..] (foldr mg [] [])))
= 2:3:4:merge [5,7,11,...] (merge [9,25,...] (8:merge [27,125,..] (foldr mg [] [])))
= 2:3:4:merge [5,7,11,...] (8:merge [9,25,...] (merge [27,125,..] (foldr mg [] [])))
= 2:3:4:5:merge [7,11,...] (8:merge [9,25,...] (merge [27,125,..] (foldr mg [] [])))
.....

so it starts producing the results much sooner, without expanding much of the inner lists. This just follows the definition of foldr,

foldr f z (x:xs) = f x (foldr f z xs)

where, because of the laziness, (foldr f z xs) is not evaluated right away if f does not demand its value (or a part of it, like its head).

Haskell - List comprehension with infinite lists

4 Answers