Recently, I've been TA-ing an introduction to functional programming course taught in Haskell. This has been a very enlightening experience, as it once again proves the notion that the best way to learn something is to teach it.

Since Haskell is rife with topics that I feel I understand but still find tricky to actually explain^{1}, trying to answer student questions can sometimes prove more challenging than expected. This post is my attempt to answer one such question that I got the other day.

Disclaimer: this post is aimed at Haskell beginners or teachers.

As part of their homework, the students had to reimplement some basic list functions on their own. One of the requirements was that, when possible, a function accepting a list should seamlessly work for both finite and infinite lists. Given that Haskell is lazy by default, this tends to happen magically on its own, without any extra effort. But understanding when this works and when it doesn't is another matter.

Here I'll try to add some intuition for the following question:

> When is a function that operates on lists "lazy enough" to handle infinite lists?

The answer I'm going to propose is that^{2}:

> A function on lists is "lazy enough"^{6} when for all inputs it uses a finite chunk of the input to produce a finite chunk of the output.

This answer might be even more confusing than the question, so let's check out some examples to get some intuition.

The `head` function^{3}:

```
head (x : _) = x
head [] = error "head on an empty list"
```

`head` takes in a list, a potentially infinite list, but it only looks at the head of the list (a finite chunk) and ignores the tail (a potentially infinite chunk of data). So `head` is "lazy enough" to handle infinite lists:

```
head [1..] == 1
```

What about `tail`?

```
tail (_ : xs) = xs
tail [] = error "tail on an empty list"
```

To produce a finite chunk of the tail, we only need to skip the head (a finite chunk) and then produce a finite chunk of the rest of the list. So this works:

```
head $ tail [1..] == 2
```

Similarly, functions like `take n` only consume a finite chunk of elements (`n` of them) to produce a finite chunk of elements (a list of `n` elements).
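For reference, here is one way `take` could be written; this is a sketch, named `myTake` to avoid clashing with the Prelude version:

```
-- A sketch of take. Once the counter reaches 0 we return [] without ever
-- touching the rest of the input, so only a finite prefix is consumed.
myTake :: Int -> [a] -> [a]
myTake n _ | n <= 0 = []
myTake _ []         = []
myTake n (x : xs)   = x : myTake (n - 1) xs
```

`myTake 3 [1..]` returns `[1, 2, 3]` after inspecting only three elements of the infinite list.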

What are examples that are not lazy enough?

Take `length` for example:

```
length [] = 0
length (_ : xs) = 1 + length xs
```

Here, to produce a finite chunk of the result (a single number), we must consume the whole list (a potentially infinite chunk of data). So there's no way this can work for infinite lists.

```
length [1..] -- never terminates
```

The same goes for `reverse`: the first element of the result of `reverse` is the last element of the input list. To produce a finite chunk of the result (the last element of the input), we must consume the whole, potentially infinite, list. And so `reverse` does not work for infinite lists.
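For concreteness, here is a sketch of the naive recursive definition, named `myReverse` to avoid shadowing the Prelude version:

```
-- The first element of the result only appears after the recursion reaches
-- the end of the input, so the whole list must be consumed before anything
-- is produced.
myReverse :: [a] -> [a]
myReverse []       = []
myReverse (x : xs) = myReverse xs ++ [x]
```

It works fine on finite lists (`myReverse [1, 2, 3] == [3, 2, 1]`), but `head (myReverse [1..])` never terminates.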

An important thing to note is that different implementations of the same function can behave differently with respect to laziness. Consider the following implementation of `head`:

```
badHead xs =
  if length xs > 0 then head' xs
  else error "head on empty list"
  where
    head' (x : _) = x
```

This implementation tries to be "smart" about checking the error condition. But now, in order to produce a finite chunk (the head of the list), we must consume the whole (potentially infinite) list to check its length. The new implementation of `head` no longer works for infinite lists:

```
badHead [1..] -- never terminates
```

Now let's take a look at something more complicated. Figuring out whether the functions above are lazy enough is probably not that difficult, so let's analyze a more challenging function: `partialSums`.

Given a list of numbers, `partialSums` creates a new list of the sums of the numbers up to the current element (starting with the empty sum, `0`):

```
partialSums [1, 2, 3, 4, 5] == [0, 1, 3, 6, 10, 15]
```

And it should also work for infinite lists:

```
take 6 $ partialSums [1..] == [0, 1, 3, 6, 10, 15]
```

We can implement this function using explicit recursion:

```
partialSums [] = [0]
partialSums (x : xs) = 0 : map (x +) (partialSums xs)
```

Why does this work for infinite lists? Let's say we want the first element of `partialSums [1..]`. We can follow the equations in this definition^{4}:

```
partialSums [1..] == -- pattern matching
partialSums (1 : [2..]) == -- by definition
0 : map (1 +) (partialSums [2..])
```

Assuming that `(:)` is lazy (it is), we can access the first element, `0`, without accessing the rest of the (infinite) result.

To be able to continue the evaluation, recall the definition of `map`:

```
map _ [] = []
map f (x : xs) = f x : map f xs
```

`map` itself is lazy: to get the first element of its result, we only need to evaluate `f` on the head of the input list. There's no need to go over the possibly infinite tail.

Let's continue evaluation and grab the next element:

```
0 : map (1 +) (partialSums [2..]) == -- evaluating the first step of `partialSums` so that we can pattern match on the result
0 : map (1 +) (0 : (map (2 +) (partialSums [3..]))) == -- by definition of `map`
0 : (1 + 0) : (map (1 +) (map (2 +) (partialSums [3..]))) == -- reduce arithmetic
0 : 1 : (map (1 +) (map (2 +) (partialSums [3..])))
```

Notice how pattern matching forced us to evaluate the next step of `partialSums`: we can't choose the correct branch of `map` without knowing whether the list is empty or not. But we don't need anything beyond that; we do not evaluate the whole list, just the existence of the first head/tail pair. More generally, pattern matching is one way to gradually force the evaluation of lists.

The next element, `1`, is available for grabbing, while the rest of the tail sits unevaluated behind `(:)`.

Just for fun, we'll repeat this once again for the third element:

```
0 : 1 : (map (1 +) (map (2 +) (partialSums [3..]))) == -- evaluate `partialSums`
0 : 1 : (map (1 +) (map (2 +) (0 : (map (3 +) (partialSums [4..]))))) == -- evaluate the second `map`
0 : 1 : (map (1 +) ((2 + 0) : (map (2 +) (map (3 +) (partialSums [4..]))))) == -- reduce arithmetic
0 : 1 : (map (1 +) (2 : (map (2 +) (map (3 +) (partialSums [4..]))))) == -- apply first `map`
0 : 1 : (1 + 2) : (map (1 +) (map (2 +) (map (3 +) (partialSums [4..])))) == -- reduce arithmetic
0 : 1 : 3 : (map (1 +) (map (2 +) (map (3 +) (partialSums [4..]))))
```

And now `3` is available to us using a finite number of elements from the input list. We can go on like this, each time consuming just one element from the input list to produce one element of the output. Since `(:)` and `map` are both lazy, we are never forced to evaluate the whole tail, just a finite chunk of it.

We can also rewrite `partialSums` in terms of `foldr`:

```
partialSums = foldr aux [0]
  where aux x res = 0 : map (x +) res
```

This is a mechanical rewriting of the recursive solution^{5}. Since `foldr` is implemented as the same recursion pattern that we wrote manually before, this works just like the recursive version. Therefore, it still works for infinite lists.

We can prove it using the same logic; we just have to apply `foldr` first. Recall that `foldr` is defined as:

```
foldr _ z [] = z
foldr f z (x : xs) = f x (foldr f z xs)
```

Now we can evaluate the new `partialSums` again:

```
partialSums [1..] == -- by definition `partialSums`
foldr aux [0] [1..] == -- pattern matching in `foldr`
foldr aux [0] (1 : [2..]) == -- by definition of `foldr`
aux 1 (foldr aux [0] [2..]) == -- apply the same logic recursively
aux 1 (aux 2 (foldr aux [0] [3..])) ==
aux 1 (aux 2 (aux 3 (foldr aux [0] [4..]))) == -- expand the `aux` calls
aux 1 (aux 2 (0 : (map (3 +) (foldr aux [0] [4..])))) ==
aux 1 (0 : (map (2 +) (0 : (map (3 +) (foldr aux [0] [4..]))))) ==
0 : (map (1 +) (0 : (map (2 +) (0 : (map (3 +) (foldr aux [0] [4..]))))))
```

From here this looks pretty much the same as in the recursive version, and is lazy for the same reasons.

Now suppose that we had a non-lazy version of `map`:

```
badMap f xs =
  if length xs == 0 then []
  else f (head xs) : badMap f (tail xs)
```

This is logically the same as the regular `map`, but it is no longer lazy, since it calls `length`, which forces the evaluation of the whole list.

If we use `badMap` to implement `partialSums`:

```
badPartialSums = foldr aux [0]
  where aux x res = 0 : badMap (x +) res
```

The result will no longer be sufficiently lazy, and won't work for infinite lists:

```
take 6 $ badPartialSums [1..] -- never terminates
```

If our function calls a "not lazy enough" function, then it stops being "lazy enough".
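One way to repair `badMap` while keeping its if/then/else shape is to test for emptiness with `null`, which only forces the outermost constructor of the list rather than its whole spine. This is a sketch (the idiomatic fix is, of course, to pattern match, as the regular `map` does):

```
-- goodMap checks emptiness with null, which only inspects whether the list
-- starts with [] or (:). It never walks the whole spine, so the function
-- stays "lazy enough" for infinite lists.
goodMap :: (a -> b) -> [a] -> [b]
goodMap f xs =
  if null xs then []
  else f (head xs) : goodMap f (tail xs)
```

Now `take 3 (goodMap (1 +) [1..])` happily returns `[2, 3, 4]`.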

Hopefully now the statement we started with makes more sense:

> A function on lists is "lazy enough" when for all inputs it uses a finite chunk of the input to produce a finite chunk of the output.

Thanks for reading this far. I hope this helped you gain some intuition about laziness and infinite lists. Reach out if you have any further questions or comments.

1. `foldr` anyone? ↩
2. This is not a formal definition, but it is good enough for our purposes. I'm not trying to be precise and fully explain laziness in Haskell, just to generate enough intuition for the students' homework. ↩
3. We'll ignore for the moment that `head` and `tail` are not safe and should be banished from our lives. I don't want to bring in `Maybe`, which might distract from the point. ↩
4. This process of moving through equations is called "equational reasoning", a very useful tool when trying to understand Haskell code. ↩
5. I'll spare myself from explaining the workings of `foldr` in this post. ↩
6. As was noted in the Reddit discussion, I'm actually overloading the word "lazy"; what I actually mean by "lazy enough" is "productive". Since we're using Haskell, laziness is entangled with productivity anyway. I'm still on the fence about whether it's worthwhile to introduce this new terminology in the context of this particular homework set. ↩