How Not to Teach Recursion

1 Some Canonical Examples

1.1 Factorial

1.2 Fibonacci

1.3 Euclid’s Algorithm

1.4 Fractals

1.5 Towers of Hanoi

2 Recursion versus Cyclicity

3 Is Recursion in the Problem?

4 How to Teach Recursion

4.1 Function Follows Form

4.2 One Datatype, Many Problems

4.3 When Recursion Becomes Difficult

4.4 Back to the Canonical Problems

4.5 Going Beyond Loops

4.6 That Was Pretty Dense

We all know how to teach recursion. We’ve done it for decades. We pick some honored, time-tested examples—Fibonacci numbers and factorial being leading candidates—and use them to teach the general idea. They’re so canonical they come directly from the gods: you can find these in books by people like Niklaus Wirth.

But I’m here to tell you they got it wrong, and everyone’s been getting it wrong ever since. Students come away underwhelmed and baffled, and go on to become the next generation of teachers who repeat this process. However, we need not repeat this cycle; we have much better methods.

1 Some Canonical Examples

Let’s start by looking at what’s wrong with our canonical examples.

1.1 Factorial

First off, almost nobody has ever needed to compute a factorial. I’ve been programming for about 35 years and the only times I’ve needed to compute a factorial was when doing recreational mathematics or using programming to explore some especially thorny combinatorics problem. Otherwise, I’ve never had any need for factorials.

Second, the answer isn’t meaningful. Quick, what’s the factorial of 13? Most people have no idea and (related to the above) don’t care. There’s nothing recognizable about the answer. The only reason most programmers recognize the number 3628800 is because we’ve tested factorial on a “big enough number” to confirm that it worked, not because we actually cared about it. If you adopt a curriculum that asks students to develop examples before they write code, students would be hard-pressed to write the answer independently; many would resort to implementing it first and plugging in the answer.

Third, factorial has poor numeric properties in most languages. Unless you have built-in, automatic support for big numbers, you will soon either get answers that print in odd-looking scientific notation, or worse, overflow (on many languages, with default integers, factorial of 17 is -288522240).

Finally, many students have written factorial-like programs before, perhaps even factorial itself, using loops. A natural question they (should!) ask an instructor is, “Why should I do this again?” The answer, inevitably, is, “Because I’m asking you to.” There’s no better way to turn off a student.

So let’s summarize the “recursion experience”: a useless problem, with unrecognizable answers, with wonky-looking behavior, using an unnecessary new technique that you’re forced to use “because”.

1.2 Fibonacci

Let’s turn our gaze to Fibonacci.

First off, almost nobody has ever needed to compute a Fibonacci number, either. I’ve been programming for about 35 years and the only times I’ve needed it was when doing recrea—you see where this is going.I do find them useful to convert beween miles and kilometers, though.

Second, the answer isn’t meaningful, and students would again be hard-pressed to write the answer independently (what, keep track of all those calls?!?). Third, it has poor numeric properties: around 44, Fibonacci is correct, then becomes negative (which is obviously incorrect), then immediately again is a very reasonable-looking positive number that just happens to be completely wrong. The main reason we haven’t cared for decades is because we really just don’t have much use for Fibonacci numbers.

Finally, it has atrocious time-complexity.

So let’s summarize the “recursion experience”: a useless problem, with unrecognizable answers, with wonky-looking behavior, that runs really sloooowly (and worse, the slowness first creeps up on you stealthily and then suddenly administers a hammer-blow to the back of your head). In fact, its main value is as a function that you should not write (straightforwardly) recursively.

If you want a bonus reason: you can compute it with a closed form.

This is how we teach recursion?

1.3 Euclid’s Algorithm

Oh, I’m not done.

Another famous mathematician, another famous function.

Okay, I’ll grant, many students have at least heard of the greatest common divisor. They may vaguely recall using it in school for dealing with fractions. So it has familiarity going for it.

Unfortunately, it was useful in a context that no longer makes sense. Once you have a programming language, you have a calculator. Unless you’re implementing a calculator (and perhaps even then), you have no real need to implement Euclid’s algorithm.

And even if you do, admit it: you don’t remember why Euclid’s algorithm works. I certainly have to re-derive why it produces the right answer every time I’ve implemented it (purely for illustrative purposes, every few years). Otherwise, it’s just a mystical pattern.

So what’s the lesson we get from Euclid’s algorithm? That recursion is for encoding mystical patterns for no-longer-useful functions? In fact, is recursion useful for anything other than weird math functions?

1.4 Fractals

I know! I know! Fractals!

By now you should be getting good at this, but here’s a cheat sheet:

More esoteric math.
There’s no meaningful termination condition! That’s the point! So you have to invent one.
The recursion is generative. (Read on.)
If you want to see anything interesting, you need to add graphics to the mix.
Maybe they’ll be retro someday, but for now, fractals were cool when you were in college, gramps, they aren’t any longer.

1.5 Towers of Hanoi

Ah yes, not math! Maybe that’s why we have this old standby.

First, we’ll start with some invented Eastern mysticism. Orientalism is always a good pedagogic device, right?

Next, it’s a problem fraught with questions of representation: how exactly do you represent the content of the towers? That is actually a somewhat interesting question for a beginning student, but (a) it’s not obvious until you’ve had some programming practice, and (b) it has absolutely nothing to do with recursion. So you have a hard, unrelated sub-problem in the middle of your example that “demonstrates” recursion. Doesn’t that utterly violate the pedagogic principle of changing just one thing at a time so students can focus on what’s salient? (And don’t forget, this is another problem with exponential time behavior!)

Furthermore, if you don’t tell students the (recursive) solution and leave them to figure it out for themselves, you haven’t given them a programming problem: you’ve given them a puzzle. Those are different things. In particular, the Towers of Hanoi is a particularly difficult kind of recursion—oh, you didn’t know there are different kinds of recursion? Read on.

Finally, again, does anyone care? Do students believe they would be called on to help monks move disks around? What other problem is similar to this one? And if there are any, do we teach them, or do we present these magnificently isolated monks…in magnificent isolation?

What’s worse is that so many of these are fundamentally static problems: there are only so many interesting inputs, and each one has a completely deterministic solution, so once one builds up a table of outputs for standard inputs, the problem is essentially solved. For such problems, it would be smart (especially when the run-time complexity is exponential in the input) to just stash the answers and never run the code again.Yes, I know, memoization.

2 Recursion versus Cyclicity

Another standard, time-honored pedagogic device is the dumb joke: “to understand recursion you must understand recursion”, and so on.

These confuse recursion with cyclicity.

If you don’t understand the difference, don’t use the dumb jokes.

If you do understand the difference, don’t use the dumb jokes then either. They’re still dumb jokes.

3 Is Recursion in the Problem?

An attempt at a better answer might be that a problem is “inherently” recursive. However, that’s just a limitation of viewpoint. There’s nothing more inherently recursive than iterative about factorial. All the problems above can be expressed even more declaratively, as in a mathematical specification, that eliminates any implementation directive (e.g.: the greatest common divisor is the greatest, common, divisor: the definition of the problem says nothing about how to find it, and searching through all numbers to find divisors that are common and taking the largest one is no less valid a solution).

No, the essence lies elsewhere. But we’re getting closer.

4 How to Teach Recursion

Wait up, I didn’t promise this. Check the title again.

But I know, I can’t just stop here. So I’ll give you a brief peek of how to proceed.

In How to Design Programs (HTDP), we have a rather different view of recursion. The key idea is this.

4.1 Function Follows Form

Where does recursion come from? HTDP argues that it arises from self-references in data. That is, recursive data suggest recursive solutions. This is the key insight you need for understanding recursion. Not only does it make sense once you think about it, it also demonstrates why most other approaches to teaching recursion are essentially incorrect.

HTDP teaches a design “recipe”. In it, you describe your program’s (or function’s) data structures, and identify self-references in these data. Self-referential data are pretty straightforward: even kids can understand them. For instance, even a child informed about biology can answer basic questions like these:

- Does a child have (biological) parents?

- Yes.

- How many?

- Two, a (biological) male and (biological) female.

- Does the female have parents?

- Yes.

- How many?

- Two, a (biological) mother and a (biological) father.

- Does the male…

They can see where this goes. They grasp the idea of a (biological) family tree pretty intuitively. They can similarly see other kinds of self-referential data even at a young age, well before they program.

Next, HTDP explains how the structure of the data suggest a structure to the solution. This solution structure is generic, and called the “template”. You arrive at this entirely mechanically from the data structure, even before you’ve contemplated the exact problem you’re trying to solve.

(The template is not a rule, it’s a suggestion. It helps you overcome the “blank page” problem by offering the outline of a suggestion. Sometimes, the template leads to a correct but insufficiently-efficient solution. Such a solution is still useful as a reference solution against which to test more efficient proposals.)

Separately, you write down examples of your problem. In the process you explore how the self-reference in the solution (what one would call “recursion”) can help you solve the task. (Programming and Programming Languages explores this in more detail.)

And that’s where recursive functions come from. This form of recursion is called structural recursion, because the recursive form follows the structure of the datum.

4.2 One Datatype, Many Problems

One of the added advantages of working with recursive data is that a given datatype will often generate many problems. This has the virtue of seting up comparisons and contrasts. A student’s effort at getting familiar with the datatype and writing good examples of it can pay off in multiple settings.

Students can also be exposed to different datatypes for the same problem space (as I discuss below). This lets them contemplate the benefits and disadvantages that each representation has over the others. This is a problem of general value, unrelated to recursion, that hasn’t been studied enough, but it also lends itself particularly well to recursive settings.

4.3 When Recursion Becomes Difficult

Later, HtDP teaches generative recursion, for when structural recursion is insufficient. Generative recursion requires an “a ha!”, because you have to come up with the non-structural solution.Why the term “generative”? Read the book. Consider, for instance, lists defined recursively with a singleton head and the rest of the list as a tail. For such lists, there is a natural sorting algorithm: the one that follows structurally.

Exercise
Which one is it?

I say it’s “natural” because it’s the one you would arrive at with the least work and that best reflects the structure of the datum. Other standard sorting solutions require some generative step.

However, what is structural varies, naturally, by the structure. Imagine you instead represent lists by appending two sub-lists.

Exercise
What is the natural (structural) sorting algorithm now?

4.4 Back to the Canonical Problems

In principle, natural numbers are also recursive data: a number is either zero or the successor of another natural number. However, this way of thinking is not natural to students (most don’t see natural numbers and immediately think, “Oh, a self-referential datatype!” the way they learn to with, say, a tree). So HTDP introduces recursion over the naturals much later than other books do. It’s just not that interesting.

Given this setup, we can actually see problems like factorial and Fibonacci as structurally recursive functions. Euclid’s Algorithm is still not structural—it’s very much generative. Towers of Hanoi sits in an interesting middle ground. But we’ve already seen why those problems are awful, so there’s no need to dwell on them further!

4.5 Going Beyond Loops

The power of recursion lies not only in capturing certain patterns of data, but also in generalizing loops. You can accumulate data, as you would with loops, but you can also return data, even return data as you process. But recursion is even more useful than that: you can recur mutually, capturing rich patterns of interconnection in data. For an unusual example of this, see this paper.

4.6 That Was Pretty Dense

It sure was! I’ve tried to distill a few hundred pages down to about one. The un-distilled HTDP is available, fully and for free, online.