# What's the Point of Learning Functional Programming?


"What's the point of learning functional programming?" was a genuine question I got from a student in a functional programming course I was TAing.

But let's rewind for a bit of background first. As I found myself standing in front of a frightened-looking class, reviewing some Haskell basics, I was starting to feel guilty[^1] about overwhelming them with all these foreign new concepts: recursion, currying, function composition, algebraic data types, and the list goes on. So it felt natural to give them an escape hatch.

I mean, if all you know in life are loops, how can you possibly make do with just recursion? So when it was time for a new homework assignment we gave them a hint along the following lines:

> Try solving with Pythonish pseudocode first, and every time you have a loop, you can convert it to a tail recursive function as follows...

We then proceeded to explain how to convert loops to tail recursion[^2]. At that, some of the students seemed mildly relieved and continued with their homework.
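For concreteness, here's a hedged sketch of the kind of conversion that hint describes, using a simple summing loop (the function names are made up for illustration): the loop variable becomes a parameter, and the loop body becomes a recursive call in tail position.

```python
def sum_squares_loop(xs):
    # The loop version: mutate an accumulator on each iteration
    total = 0
    for x in xs:
        total = total + x * x
    return total

def sum_squares_rec(xs, total=0):
    # The tail-recursive version: the loop variable `total` becomes a
    # parameter, and each iteration becomes a tail call
    if not xs:
        return total
    return sum_squares_rec(xs[1:], total + xs[0] * xs[0])
```

The same mechanical rewrite applies to the homework's search loop, which is exactly why it teaches so little on its own.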

Fast forward to after the submissions. A student approached me after class with a question. I was surprised; until then I had scarcely seen them outside their Haskell stupor. And here's what the student had to say: "If all we did was write some Python in Haskell syntax, what's the point of learning functional programming?"

Thinking about it, he was right. With the escape hatch we gave them, they really could solve the homework by mechanically translating Python code into Haskell. Apart from some new syntax, you're not really learning much of anything new. And that's unfortunate: there's so much to learn from Haskell and functional programming in general. It would be a shame to miss the opportunity.

Can we do better?

## The Homework

The homework problem was to solve the knight's tour problem. That is, given a chess board of arbitrary dimensions and a starting position, find a path a knight can take that visits every cell of the board exactly once.

Efficiency was not the point of this exercise, so the students could just do a brute-force search for the correct path using backtracking. Suppose you're a student and Python is your weapon of choice. How would you solve this?

Here's the core function of a very naive attempt at brute-forcing the solution[^3]:

```python
# 1
def tour(n, visited, moves, pos):
    if len(visited) == n * n:  # 2
        return moves
    for knight_move in all_moves:  # 3
        (row, col) = pos
        (dx, dy) = knight_move.value  # 4
        next_row = row + dx
        next_col = col + dy
        next_pos = (next_row, next_col)
        # 5
        if is_valid(n, visited, next_row, next_col):
            new_visited = visited + [next_pos]  # 6
            new_moves = moves + [knight_move]
            # 7
            result = tour(n, new_visited, new_moves, next_pos)
            # 8
            if result is not None:
                return result
    # 9
    return None
```

This is not very good code, for various reasons. But that's not really the point: it (slowly) solves the problem as stated, and that's good enough for illustration purposes. Let's review what this does:

  • `tour` is a recursive function (1) that takes the current state as input:
    • The size of the board `n`
    • The list of already `visited` coordinates
    • The `moves` we constructed so far
    • And a tuple of coordinates `pos` for the current position
  • We have a stopping condition (2): if the path we visited covered the whole board, we are done and we return the moves that were constructed
  • If not, we iterate over all possible knight moves (3), which are defined as a separate enum
  • For each such move, we construct the next cell that we are going to visit (4)
  • If the new cell is valid (5), i.e., within the bounds of the board and not visited before
  • We add the new position to the list of visited cells (6), and the current move to the moves we are building[^4]
  • We then proceed with the next step by recursively calling `tour` with the new state (7)
  • If the recursive call was successful, we return its result (8)
  • Otherwise, we backtrack on this attempt and try the next knight move in the list
  • When we've exhausted all the possible knight moves, we give up and return `None` (9)
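The code above leans on two helpers that aren't shown. Here is one plausible sketch of them (the actual definitions live in the course repo; the enum, its move offsets, and `is_valid` as written here are my reconstruction, not the original code):

```python
from enum import Enum

class KnightMove(Enum):
    # The eight (row, col) offsets a knight can move by
    UP_LEFT = (-2, -1)
    UP_RIGHT = (-2, 1)
    LEFT_UP = (-1, -2)
    RIGHT_UP = (-1, 2)
    LEFT_DOWN = (1, -2)
    RIGHT_DOWN = (1, 2)
    DOWN_LEFT = (2, -1)
    DOWN_RIGHT = (2, 1)

all_moves = list(KnightMove)

def is_valid(n, visited, row, col):
    # A move is legal if it stays on the n-by-n board
    # and lands on a cell we haven't visited yet
    return 0 <= row < n and 0 <= col < n and (row, col) not in visited
```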

This is all good and well, but the actual homework is in Haskell. What good is it to have this Python solution instead?

## A Haskell Rewrite

It is true that having a concrete (and correct) solution written out is very helpful when thinking about a problem. But with the tip we gave the students about loops and recursion, it's even better than that: it's actually fairly mechanical to translate the Python code into Haskell. To wit[^5]:

```haskell
tour n visited moves row col =
  if length visited == n * n
    then Just moves -- 1
    else go allMoves -- 2
  where
    -- 3
    go [] = Nothing
    go (knightMove : rest) =
      let (dx, dy) = moveValue knightMove
          nextRow = row + dx
          nextCol = col + dy
          nextPos = (nextRow, nextCol)
      in if isValid n visited nextRow nextCol
           then
             let newVisited = visited ++ [nextPos]
                 newMoves = moves ++ [knightMove]
                 result = tour n newVisited newMoves nextRow nextCol
             -- 4
             in case result of
                  Just solution -> Just solution
                  Nothing -> go rest -- 5
           else go rest -- 6
```

If you're familiar with Haskell syntax, this code should look basically the same as the original Python code. There are a few differences, but they are mostly cosmetic:

  • Since there's no `None` (or nulls in general) in Haskell, we have to wrap things with `Maybe` (1); it's a bit more syntax compared to Python, but not by much
  • Haskell is expression-oriented, so there are no early returns
  • As a consequence, the stopping condition has an `else` (2) that invokes the logic for the next step
  • The main loop over knight moves was replaced by a tail recursive function called `go` (3); this is our tip in action
  • We need a bit more ceremony when working with `Maybe`, so we pattern match on it (4)
  • Since we are not running in a loop, we must invoke the next backtracking step explicitly by calling `go` (5, 6) with a new argument

If you're a student who just created this Haskell solution from the Python sketch, what new things did you learn?

You might've learned that it's possible to work without nulls. But since this code is small, the consequences of not having null are not very visible, and you mostly get some added syntactic noise[^6]. Besides, not having null is not a uniquely functional thing; Rust, for example, doesn't use null either.

There's a glimpse of being more expression-oriented here. That's a deep and very impactful principle. But since we just mechanically translated from Python, it's easy to miss its significance. This is even more obscured by the fact that it's difficult to appreciate the consequences of expression-orientedness "in the small".

Lastly, we learned that it's possible to replace iteration with simple recursion. Which is cool and all, but what's the point? `for` loops work well enough; using recursion just for that feels like lousy (and overly general) syntax for something we already have figured out.

If you submit this solution to the problem it won't be out of place to wonder "what's the point?".

## Going Functional

Now you may be screaming at the screen that no, functional programming is worth learning, it's not just Python with uglier syntax. And you would be right, there is a lot to learn, even with a simple assignment such as the one we are working with now.

But to get these benefits we must make an effort to break away from our "mainstream" roots, and stop translating the old way into a new syntax. We need to completely change the way we think about the problem and its solution.

For problems like this knight's tour, one good approach is to use what is called "wholemeal programming":

> Functional languages excel at wholemeal programming, a term coined by Geraint Jones. Wholemeal programming means to think big: work with an entire list, rather than a sequence of elements; develop a solution space, rather than an individual solution; imagine a graph, rather than a single path. The wholemeal approach often offers new insights or provides new perspectives on a given problem. It is nicely complemented by the idea of projective programming: first solve a more general problem, then extract the interesting bits and pieces by transforming the general program into more specialised ones.

So instead of exploring the solution space step by step, we'll create a list of the whole solution space at once, and then use that list to solve the problem we are interested in. Then I'm sure we'll find something new to learn about programming and problem-solving.

The first issue to tackle is: how can we possibly represent the whole solution space? It might be huge. Luckily for us, Haskell is lazy by default, so we can pretend that we have the full list while actually computing it only on demand.

Next, since we are going to deal with a state space, what are the actual states that we will be working with?

We can encode a single state in the tour with this record[^7]:

```haskell
data TourState = TourState
  { board :: Board
  , visited :: [KnightPos]
  , moves :: [KnightMove]
  , current :: KnightPos
  }
```

These are basically the arguments to the original `tour` function wrapped up in a single record (with some more civilized wrapper types)[^8].

To be able to explore the state space, we need a way to move between states. For that we can define:

```haskell
moveState :: TourState -> KnightMove -> Maybe TourState
```

Given a single state and a move, we compute the next state by calculating the new position and updating the list of moves. Since not every move is legal from a given state, we accommodate possible failure with a Maybe. If we can't move to the next state the result will be Nothing.

You can see the implementation in the repo, but it's basically the same logic that we had before, just adapted to using TourState.
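If you want to mirror this step outside Haskell, here is a hedged Python sketch of the same idea, with `None` standing in for `Nothing` (the dict-based state and all names here are hypothetical illustrations, not the repo's code):

```python
def move_state(state, move):
    # Try to apply one knight move to a state; return None for an
    # illegal move, mirroring the Maybe in the Haskell signature
    (row, col) = state["current"]
    (dx, dy) = move
    next_pos = (row + dx, col + dy)
    n = state["n"]
    off_board = not (0 <= next_pos[0] < n and 0 <= next_pos[1] < n)
    if off_board or next_pos in state["visited"]:
        return None
    return {
        "n": n,
        "visited": state["visited"] + [next_pos],
        "moves": state["moves"] + [move],
        "current": next_pos,
    }
```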

With moveState we can apply one move to a state. Next we want to try out all the possible knight moves from a state. The signature will be:

```haskell
nextStates :: TourState -> [TourState]
```

From a given TourState generate all possible (legal) states that can be reached by a single step. Here's the implementation:

```haskell
nextStates state = mapMaybe (moveState state) allKnightMoves
```

We take all the possible knight moves and, with the help of `mapMaybe`, apply `moveState` from the current state to every element in the list of moves. `mapMaybe` takes care of removing all the `Nothing` entries, so we end up with a flat list of just the legal states we can reach.

This is actually our first example of wholemeal programming. Notice how we didn't iterate step by step: instead of thinking about individual knight moves, we applied the moving logic to the whole list with `mapMaybe`.
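There's no `mapMaybe` in Python's standard library, but the pattern is easy to reproduce. A minimal sketch (`map_maybe` and `parse_int` are illustrative names of my own, not standard functions):

```python
def map_maybe(f, xs):
    # Apply f to every element at once and keep only the non-None
    # results, like Haskell's mapMaybe
    return [r for r in (f(x) for x in xs) if r is not None]

def parse_int(s):
    # A possibly-failing function, playing the role of moveState
    try:
        return int(s)
    except ValueError:
        return None
```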

With these helpers we are now ready to create a (potentially) infinite list of all the states that we can be in while searching for a tour from the initial state. Concretely, we need to implement the following signature:

```haskell
allStatesFrom :: TourState -> [TourState]
```

Given a single state, we create a list of all the states that can be reached from it.

Creating an infinite list of steps might sound intimidating, but we'll do it anyway:

```haskell
allStatesFrom state = -- 1
  let
    next = nextStates state -- 2
    allTheRest = concatMap allStatesFrom next -- 3
  in
    state : allTheRest -- 4
```

We start from the initial state (1) and compute all the legal next states using `nextStates` (2). This is one step into the search. Next, we want to take the following steps from each of these new states. We do this by recursively[^9] calling `allStatesFrom` on each of the next states (3); `concatMap` flattens all the resulting lists into one big list. Finally, we construct the full list of states by prepending the initial state to `allTheRest`, which holds all the deeper steps (4).

And just like that we have a list that represents all the possible states in the problem we are solving. This is a bit mind-bending, take a moment to let the tricky recursion sink in properly.

Notice how "wholesome" this code is. We don't deal with edge cases or stopping conditions; we just generate the whole thing in one fell swoop. Nor do we worry about going off into infinity: laziness prevents the list from being computed prematurely[^10].

With this list in hand, it's almost trivial to solve the knight's tour problem. We just need to find the first state from the list that covers the whole board.

We can define the stopping condition like so:

```haskell
isComplete TourState{board, visited} =
  boardSize board == length visited
```

This function determines whether a given state is a solution by checking the size of the list of visited positions (which happens to be the same as the stopping condition in the original Python solution).

With these two building blocks in hand, a list of states and a stopping condition, we can easily compose them into a single solution:

```haskell
tour :: Board -> KnightPos -> Maybe [KnightMove]
tour board pos = -- 1
  let
    -- 2
    init = initState board pos
    -- 3
    finalState = find isComplete (allStatesFrom init)
  in
    fmap getMoves finalState -- 4
```

Step by step:

  • The final `tour` function takes in a board and an initial position (1)
  • From this we build the initial state (2), which just initializes a fresh instance of `TourState` with the current position and an empty list of moves
  • Now for the highlight of this solution: we use `find` (3) to locate the first `TourState` that matches `isComplete` in the full list of states generated from the initial state
  • The final result (4) just extracts the moves from the state we found (which might be missing, hence the `fmap` call)

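The glue here, `find`, has a close analogue in most languages: "first element of a lazily filtered stream". A hedged Python sketch (`find` is my own helper name, not a builtin):

```python
import itertools

def find(pred, iterable):
    # Return the first element satisfying pred, or None if there is
    # none; consumes the iterable only as far as needed
    return next((x for x in iterable if pred(x)), None)
```

Because it stops at the first match, it is safe to run over an infinite stream, e.g. `find(lambda m: m * m > 50, itertools.count())` inspects only the first nine numbers.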
I find this approach to be breathtakingly elegant. Thanks to the classic "Why Functional Programming Matters" paper for the inspiration. Not only does it work, but now we have an opportunity to learn something truly novel.

## What Did We Learn?

With such a strikingly different solution, we are bound to learn something new. Here are a few lessons that we can take away from this.

### Laziness

To generate the list of states we used laziness in an essential, non-contrived way. Without laziness, the list could potentially overflow memory or take too much time to compute upfront, making it impractical to generate "the whole state space".

Haskell's built-in laziness makes this invisibly seamless (for better or for worse), but the lesson is applicable almost anywhere. Once we recognize the need for laziness, we can achieve it with the appropriate technique in pretty much any modern language: generators in Python, streams/iterators in Java, or a streaming library in whatever language you use.

Laziness, in its conceptual form, is a powerful tool we can use anywhere. And developing an intuition for it from lazy Haskell lists makes the approach to it relatively gentle.
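To make that concrete, here is a hedged Python sketch of `allStatesFrom` as a generator; `next_states` is passed in as a parameter, since its real definition depends on the problem, and all names here are illustrative:

```python
def all_states_from(state, next_states):
    # Lazily yield the state, then everything reachable from it, in
    # the same preorder as the Haskell list; nothing is computed
    # until a consumer asks for the next element
    yield state
    for nxt in next_states(state):
        yield from all_states_from(nxt, next_states)
```

Consumers pull one state at a time, so even a huge (or infinite) space is fine as long as the search stops early.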

### Explicit State Space

The way we went about exploring the state space forced us to actually think about that space. What are the possible states? How do we move between them?

This was explicitly reified in the `TourState` type and the `moveState` function.

The state space exists independently of whether we bother to acknowledge it or not. But my feeling is that when thinking imperatively about code it's much easier to "miss the forest for the trees". Focusing on each step we are making in some iterative loop tends to hide the significance of the state space as a whole from us.

Whether or not you're applying functional programming to your code, keeping in mind the state space can only be beneficial. If only for something like "make illegal states unrepresentable".

### Modularity

By separating the exploration of the state space from the stopping condition, we made the code much more modular than it was before. Previously we had to intertwine the stopping condition directly into the iteration, but no more.

This kind of modularity opens the door to many possible modifications that we can apply to one part of the code without disturbing the others.

Some examples:

  • We can change the stopping condition without touching the state space exploration logic. Instead of using `find` to locate the first solution, we could search for all solutions. Or we could search only for "closed tours" (where the final cell is adjacent to the initial cell).
  • In the solution code we are exploring the state space using a preorder depth-first traversal. We can easily adapt it to be any other depth-first traversal, or with some more effort we can make it into any traversal order we want.
  • The current algorithm is a naive brute-force approach; we can make things faster by pruning the state space list with heuristics, like applying Warnsdorff's rule.
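As a hedged sketch of that mix-and-match freedom (all names here are illustrative): the same lazy stream of states can feed different goals, without touching how the stream is produced.

```python
def solutions(states, is_goal):
    # All goal states, still lazy: swap is_goal without touching
    # the state-space generation feeding `states`
    return (s for s in states if is_goal(s))

def first_solution(states, is_goal):
    # The original behaviour: stop at the first goal state
    return next(solutions(states, is_goal), None)
```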

The important thing is that we can make all these changes by modifying only the code that is responsible for it, without touching its surroundings. This also means that everything can be tested separately without having to test unrelated parts.

Contrast this with the original code, where there's no way to modify or test anything without touching everything else; the stopping condition sits smack in the middle of the iteration.

That's a level of modularity that takes some effort to achieve with procedural code, but flows out naturally from the wholemeal approach to solving problems.

### Composition

Complementing modularity, we have composition[^11].

It's one thing to split code into modules; it's another how easily those modules compose back into a single piece of code that actually solves the full problem.

In our case it was very easy: just a call to `find` along with the state space list and the completion predicate. The same applies to all the variants described in the previous section; we can mix and match the different implementations to generate new working pieces of software.

Using higher-order functions (like `find`) as the glue for simple, purely functional (side-effect-free) components is a great recipe for code with good composition properties.

Quoting from Tony Morris:

> Suppose any function that exists. I will suppose one that is made of parts A, B, C and D. I might denote this A ∘ B ∘ C ∘ D. Now suppose you desire a function that is made of parts A, B, C and E. Then the effort required for your function should be proportional to the effort required for E alone.
>
> The closer you are to achieving this objective for any function, the closer you are to the essence of (pure) functional programming. This exists independently of the programming language, though programming languages will vary significantly in the degree to which they accommodate this objective.
>
> Imperative programming – specifically the scope of a destructive update – is the antithesis of this objective.

### Wholemeal Programming

Tying all these concepts together is wholemeal programming, showing us how to tackle problems with a wholly different mindset.

Not only does it force us to think differently about problem-solving, but it brings concrete, practical advantages to the code we produce.

Although it's possible to see every one of these advantages in other paradigms, functional programming really shines in making those advantages stand out. Once you learn about them, you can apply them pretty much anywhere. You just need to stop mechanically translating your existing knowledge and break away from everything you think you know.

So what's the point of learning functional programming? You tell me...

If you enjoyed this, reach out for a workshop on Functional Programming, where I teach how to apply Functional Programming to improve code quality in any language.


## Footnotes

[^1]: I'm sure the guilt fades away over time, but I was doing this for only one semester.

[^2]: Funny how programmers usually find themselves going in the opposite direction. Let's ignore for now the fact that in Haskell tail recursion is not that important.

[^3]: The full code for all the examples that follow can be found on GitHub. The Python solution here.

[^4]: Appending to a list in a loop like this is quadratic, which might slow down the implementation. Can you think of a better way to do this while still using a simple list?

[^5]: See the full code here.

[^6]: Nor do we use any of the special `Maybe` functions like `map`.

[^7]: See the full code here.

[^8]: We are still keeping a naive mindset towards performance so as not to distract from the main point. But playing around with the types of the record can yield some speedups (e.g., not using lists for sets).

[^9]: Recursion is sometimes called "the goto of functional programming". One might attempt to remove the explicit recursion in favor of something more structured, like `unfold`.

[^10]: The state space for this particular problem is actually finite. But that's beside the point, since it can be very large for sufficiently large boards. For such boards, actually computing the whole list would likely crash the program.

[^11]: I'm not using the term "compositionality" because it has a more specific technical meaning which I'm not trying to imply here.

