# Recursion schemes fundamentals

by Oli Makhasoeva • January 02, 2020 • scala, recursion schemes, skeuomorph • 43 minutes to read

## Motivation

Recursive structures appear in many problems, from databases to machine learning, and writing functions that operate over them is not always simple. Since functional programming tries to abstract as much as possible, it offers a way to decouple recursion from the implementation of business rules.

## Overview

We will start our discussion with a recursive data structure that everyone uses in every project: the list. We are going to copy-paste two functions, `foldRight` and `unfold`, from the Scala standard library. `foldRight` applies a binary operator to all elements of a list and a start value, going right to left; it returns the start value if the list is empty. `unfold` produces a collection, using a function to produce elements and update an internal state.
How do these functions relate to recursion schemes? They are, in fact, mere special cases of more general functions named `cata` and `ana`! To see convincing proof of this, please follow the steps of refactoring we show below.

## `foldRight` and `unfold` definitions

As a starting point, let’s use the definitions of `foldRight` and `unfold`:

```
def foldRight[B](z: B)(op: (A, B) => B): B
```

```
def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): List[A]
```

We need to make one change to `foldRight` before we get started. It is defined as a method on `List[A]`; the following standalone definition is equivalent:

```
def foldRight[A, B](init: List[A])(z: B)(op: (A, B) => B): B
```

We just passed a list as a parameter.

## `foldRight` and `unfold` are duals of each other

If you squint hard enough, you will notice that `foldRight` and `unfold` are duals of each other. It will be instrumental to rename the type parameters to make the duality more obvious later.

```
def foldRight[E, B](init: List[E])(z: B)(op: (E, B) => B): B
def unfold[E, A](init: A)(f: (A) => Option[(E, A)]): List[E]
```

Here, `foldRight` takes a list of `E` and produces a value, whereas `unfold` does the opposite, i.e., it takes a value and produces a list of `E`.

## `foldRight` and `unfold` implementations

To implement these functions, we are going to use the close-up technique: we stare at the signatures of the functions until we have an idea of how to use what we are given, step by step. That is possible because we are talking about FP, which means all our functions are pure and we have nothing but the input to produce the output.

### `foldRight` implementation

First, let’s take a look at `foldRight`’s parameters.

```
init: List[A]
z: B
op: (A, B) => B
```

The return type of `foldRight` is `B`, and considering the input, we could, for instance, always return `z`.

```
def foldRight[A, B](init: List[A])(z: B)(op: (A, B) => B): B = z
```

Even though it compiles (always a good sign!), our function doesn’t do anything helpful in that case.
At second glance, we notice that we could pattern match on `init`.

```
def foldRight[A, B](init: List[A])(z: B)(op: (A, B) => B): B =
  init match {
    case Nil          => ???
    case head :: tail => ???
  }
```

The first case, `case Nil =>`, suggests that we don’t have a value to apply to `op`, so the only thing we can do is return `z`. But what do we do in the second case, where we have a `head` and a `tail`? We could mistakenly return `op(head, z)` and drop the `tail`, since it is not of type `B`. Unfortunately, the compiler isn’t able to warn us if we forget the recursive call altogether, for example, if we return `z` in both cases, for `Nil` and for `head :: tail`.
So we need to think harder: what can we do with the `tail` in order to make it of type `B`? Right, we can recursively call `foldRight` on it!

```
def foldRight[A, B](init: List[A])(z: B)(op: (A, B) => B): B =
  init match {
    case Nil          => z
    case head :: tail => op(head, foldRight(tail)(z)(op))
  }
```

#### Example

To check the behavior of the function, let’s introduce `prodOp`, a function that multiplies two integers.

```
val prodOp: (Int, Int) => Int = _ * _
foldRight(1 :: 10 :: 20 :: Nil)(1)(prodOp) // 200: Int
```
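The same fold computes other aggregates if we swap the operator. As a quick throwaway sketch (using the standard library’s `foldRight`, as above), a different operator counts the elements instead:

```scala
// A different binary operator: ignore the element, bump the accumulator.
val lenOp: (Int, Int) => Int = (_, acc) => acc + 1

val n = (1 :: 10 :: 20 :: Nil).foldRight(0)(lenOp)
assert(n == 3)
```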

Let’s take a closer look at the order of operations. `::` prepends an element to the beginning of a list. Let’s rearrange the construction so that the order in which it executes is clearer.

```
::(1,
  ::(10,
    ::(20,
      Nil)))
```

In this trace, we start with a `Nil` and then prepend the other elements step by step. Similarly, we start our computation at the very bottom of the data structure and then, as we track back up, we evaluate:

```
prodOp(1,
  prodOp(10,
    prodOp(20,
      1)))
prodOp(1,
  prodOp(10,
    20))
prodOp(1,
  200)
200
```

Note that `foldRight` effectively replaces every `::` with `prodOp`.

### Implementation of `unfold`

The implementation of `unfold` involves similar steps, and it would be a good exercise for you to implement it on your own.

#### Solution

```
def unfold[E, A](init: A)(f: (A) => Option[(E, A)]): List[E] =
  f(init) match {
    case None         => Nil
    case Some((e, a)) => e :: unfold(a)(f)
  }
```

#### Example

Let’s use a function that generates a list of descending integers starting with the given integer.

```
val rangeOp: Int => Option[(Int, Int)] =
  v => if (v <= 0) None else Some((v, v - 1))

unfold(10)(rangeOp) // List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1): List[Int]
```

Like `foldRight`, `unfold` descends through the computation step by step for as long as it can produce the next integer, and then it combines the results into a list.

## Changing `foldRight` to show the duality better

```
def foldRight[E, B](init: List[E])(z: B)(op: (E, B) => B): B
def unfold[E, A](init: A)(f: (A) => Option[(E, A)]): List[E]
```

It’s easy to see that `foldRight` takes more parameters than `unfold`. Can we do anything about it?
We could think of `z`, which gives us a value of type `B`, as a function from `Unit` to `B`.

```
z ---------> B
() ----z----> B
```

The `op` function also returns a value of `B`:

```
(E, B) ----op----> B
```

We could combine them together into a single function taking either a unit or a tuple `(E, B)`; `Either` is the natural representation of such a sum type in Scala.

```
Either[Unit, (E, B)] -----f----> B
```
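As a sketch of this combination (`z` and `op` fused into one function over `Either`; the concrete names and values here are just for illustration):

```scala
val z: Int = 1
val op: (Int, Int) => Int = _ * _

// One function that covers both the "no input" case (z)
// and the "pair" case (op)
val f: Either[Unit, (Int, Int)] => Int = {
  case Left(())      => z
  case Right((e, b)) => op(e, b)
}

assert(f(Left(())) == 1)
assert(f(Right((10, 20))) == 200)
```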

Moreover, we could go further and replace `Either` with `Option`, because it doesn’t make much sense to keep an `Either` with a unit on the left side, given that `Either[Unit, A]` is isomorphic to `Option[A]`.
Consequently, the next two definitions are equivalent:

```
def foldRight[E, B](init: List[E])(z: B)(op: (E, B) => B): B
def foldRight[E, B](init: List[E])(f: Option[(E, B)] => B): B
```
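The claim that `Either[Unit, A]` is isomorphic to `Option[A]` is easy to witness directly with two throwaway conversion functions:

```scala
def toOption[A](e: Either[Unit, A]): Option[A] = e.toOption
def toEither[A](o: Option[A]): Either[Unit, A] = o.toRight(())

// The round trips are identities, witnessing the isomorphism
assert(toEither(toOption(Right(42))) == Right(42))
assert(toEither(toOption(Left(()))) == Left(()))
assert(toOption(toEither(Some(42))) == Some(42))
assert(toOption(toEither(None)) == None)
```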

Let’s reimplement `foldRight`:

```
def foldRight[E, B](init: List[E])(f: Option[(E, B)] => B): B =
  init match {
    case Nil          => f(None)
    case head :: tail => f(Some((head, foldRight(tail)(f))))
  }
```

Nothing has changed dramatically, except that we now call our function `f` in both cases.
We also need to massage the example to satisfy the new signature.

```
val prodf: Option[(Int, Int)] => Int = {
  case None         => 1 // Hello, `z`!
  case Some((x, y)) => x * y
}
```

And apply it to our function.

```
foldRight(1 :: 10 :: 20 :: Nil)(prodf) // still 200: Int
```

## Refactoring 1: pass fewer parameters

We are moving in the right direction, and now is an excellent time to revisit our functions.
What problems do we have?
Unfortunately, in both cases, we pass the initial value and the function `f` on every step of the recursion.

```
def foldRight[E, B](init: List[E])(f: Option[(E, B)] => B): B =
  init match {
    case Nil          => f(None)
    case head :: tail => f(Some((head, foldRight(tail)(f)))) // here `tail` and `f` are passed around
  }

def unfold[E, A](init: A)(f: (A) => Option[(E, A)]): List[E] =
  f(init) match {
    case None         => Nil
    case Some((e, a)) => e :: unfold(a)(f) // here `a` and `f` are passed around
  }
```

The second concern is that we have curried parameters and, for better readability, we would like to avoid that.

We could get rid of both problems by (1) returning a function instead of a value and (2) introducing a nested function:

(1):

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B
def unfold[E, A](f: (A) => Option[(E, A)]): A => List[E]
```

Notice the striking resemblance between these two signatures! They mirror each other:

```
(Option[(E, B)] => B) => (List[E] => B)
(A => Option[(E, A)]) => (A => List[E])
```

Because the function we need to return is recursive, we have to build it using an auxiliary variable.

(2):

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  lazy val kernel: List[E] => B = {
    case Nil          => f(None)
    case head :: tail => f(Some((head, kernel(tail))))
  }
  kernel
}
```

### Implementation of `unfold`

```
def unfold[E, A](f: (A) => Option[(E, A)]): A => List[E] = {
  lazy val kernel: A => List[E] = init =>
    f(init) match {
      case None         => Nil
      case Some((e, a)) => e :: kernel(a)
    }
  kernel
}
```

That works, but is there a way to handle it without the auxiliary `lazy val`? Yes, we can use an anonymous class with a self-reference (`kernel`).

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    def apply(init: List[E]): B = init match {
      case Nil          => f(None)
      case head :: tail => f(Some((head, kernel(tail))))
    }
  }
```
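A quick sanity check that this version behaves like the original; the snippet repeats the definitions so that it runs standalone:

```scala
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    def apply(init: List[E]): B = init match {
      case Nil          => f(None)
      case head :: tail => f(Some((head, kernel(tail))))
    }
  }

// The same product function as before
val prodf: Option[(Int, Int)] => Int = {
  case None         => 1
  case Some((x, y)) => x * y
}

assert(foldRight(prodf)(1 :: 10 :: 20 :: Nil) == 200)
```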

You are strongly encouraged to venture out and try to write an implementation of `unfold` on your own!

### Implementation of `unfold`

```
def unfold[E, A](f: (A) => Option[(E, A)]): A => List[E] =
  new (A => List[E]) { kernel =>
    def apply(init: A): List[E] = f(init) match {
      case None         => Nil
      case Some((e, a)) => e :: kernel(a)
    }
  }
```

## Refactoring 2: factor out the data structure

So far, our implementation is tightly coupled to a list, our data structure. While we cleaned up the code a bit, we didn’t generalize our functions.
To better understand where the `List` type appears, let’s break down our function into steps. The point here is to illustrate which steps depend on `List` and eliminate them.
Let’s start with `foldRight`. The first thing it does is pattern match on the list.

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    def apply(init: List[E]): B = init match {
      case Nil          => ???
      case head :: tail => ???
    }
  }
```

We can think of this step as unpacking, or projecting, the data structure. Next, `foldRight` proceeds with a recursive call in the case where the list is not empty (see the `kernel` call).

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    def apply(init: List[E]): B = init match {
      case Nil          => ???
      case head :: tail => ??? // ... kernel(tail) ...
    }
  }
```

Finally, we compute the result using `f`.

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    def apply(init: List[E]): B = init match {
      case Nil          => f(???)
      case head :: tail => f(???)
    }
  }
```

To summarize it, let’s name the steps:

- Unpacking/projecting data structure
- Recursion: call self for the nested structure
- Computation

How can we reimplement our `foldRight`? The idea is simple: instead of having one `apply` function, we will introduce three new functions, one per step.
This is a slight shift in perspective:

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  new (List[E] => B) { kernel =>
    def step1: ??? => ??? // unpack
    def step2: ??? => ??? // recurse
    def step3: ??? => ??? // compute
    def apply(init: List[E]): B = step3(step2(step1(init)))
  }
}
```

As is already apparent, `step1` should take a `List[E]` as a parameter. Given that we can only end up with a `Nil` or a `head` and a `tail`, the most general possible return type is `Option[(E, List[E])]`. That forces us to accept it as the parameter of `step2`.
It should also be noted that `step3` is essentially `f` and, therefore, we can state its signature as `def step3: Option[(E, B)] => B`. `step3`’s parameter certainly gives us the return type of `step2`. Let’s write down the functions with the required signatures:

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  new (List[E] => B) { kernel =>
    def step1: List[E] => Option[(E, List[E])] // unpack
    def step2: Option[(E, List[E])] => Option[(E, B)] // recurse
    def step3: Option[(E, B)] => B // compute
    def apply(init: List[E]): B = step3(step2(step1(init)))
  }
}
```

It’s now fairly simple to write all three functions:

```
def step1: List[E] => Option[(E, List[E])] = {
  case Nil          => None
  case head :: tail => Some((head, tail))
}

def step2: Option[(E, List[E])] => Option[(E, B)] = {
  case None          => None
  case Some((e, le)) => Some((e, kernel(le)))
}

def step3: Option[(E, B)] => B = f
```
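Assembled into one runnable piece, the three-step version looks like this (a self-contained sketch; `step3` is kept for symmetry even though it is just `f`):

```scala
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B =
  new (List[E] => B) { kernel =>
    // unpack the data structure
    def step1: List[E] => Option[(E, List[E])] = {
      case Nil          => None
      case head :: tail => Some((head, tail))
    }
    // recurse into the nested structure
    def step2: Option[(E, List[E])] => Option[(E, B)] = {
      case None          => None
      case Some((e, le)) => Some((e, kernel(le)))
    }
    // compute the result
    def step3: Option[(E, B)] => B = f
    def apply(init: List[E]): B = step3(step2(step1(init)))
  }

val prodf: Option[(Int, Int)] => Int = {
  case None         => 1
  case Some((x, y)) => x * y
}

assert(foldRight(prodf)(1 :: 10 :: 20 :: Nil) == 200)
```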

We don’t benefit from having `step3`, because it is identical to `f`, so we can remove it and use `f` directly instead.

Our goal is to get you accustomed to working with more abstract structures and to develop the ability to recognize them.
In the next code snippet, we introduce the type aliases `F[P]` and `S` to reduce distraction and focus on the abstraction.

```
def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  type F[P] = Option[(E, P)]
  type S    = List[E]
  new (S => B) { kernel =>
    def step1: S => F[S] = {
      case Nil          => None
      case head :: tail => Some((head, tail))
    }
    def step2: F[S] => F[B] = {
      case None          => None
      case Some((e, le)) => Some((e, kernel(le)))
    }
    def apply(init: S): B = f(step2(step1(init)))
  }
}
```

`step2` resembles a `Functor`; indeed, it’s almost like mapping over an `Option`. To be precise, it’s mapping over the second component of the tuple wrapped in the `Option`. Leveraging some functionality from the `cats` library, we can write quite generic code!

```
import cats.Functor
import cats.syntax.functor._

def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  type F[P] = Option[(E, P)]
  type S    = List[E]
  implicit val F = Functor[Option].compose[(E, ?)]
  new (S => B) { kernel =>
    def step1: S => F[S] = {
      case Nil          => None
      case head :: tail => Some((head, tail))
    }
    def step2: F[S] => F[B] = _.fmap(kernel)
    def apply(init: S): B = f(step2(step1(init))) // or f(step1(init).fmap(kernel)), if we inline step2
  }
}
```
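If you want to follow along without the `cats` dependency and the kind-projector plugin, the composed functor can be hand-rolled. This is only a sketch, with a minimal stand-in `Functor` trait:

```scala
// Minimal stand-in for cats.Functor, just for this sketch
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(g: A => B): F[B]
}

def foldRight[E, B](f: Option[(E, B)] => B): List[E] => B = {
  type F[P] = Option[(E, P)]
  type S    = List[E]

  // Functor for Option[(E, ?)]: map over the second component of the pair
  val F = new Functor[F] {
    def map[A, C](fa: F[A])(g: A => C): F[C] =
      fa.map { case (e, p) => (e, g(p)) }
  }

  new (S => B) { kernel =>
    def step1: S => F[S] = {
      case Nil          => None
      case head :: tail => Some((head, tail))
    }
    def apply(init: S): B = f(F.map(step1(init))(kernel))
  }
}

val prodf: Option[(Int, Int)] => Int = {
  case None         => 1
  case Some((x, y)) => x * y
}

assert(foldRight(prodf)(1 :: 10 :: 20 :: Nil) == 200)
```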

Now the only remaining dependency on `List[E]` is our `step1` function. Thus, if we move it outside of `foldRight`, the function becomes as abstract as possible. Let’s refactor this further and move everything outside.
The first thing to refactor is our type `F[P] = Option[(E, P)]`. We can use the following pattern (also known as a pattern functor): `type ListF[A, B] = Option[(A, B)]`.
The second step is to get rid of `List[E]` and `E` in the signature entirely and leave `S` as a type parameter of `foldRight` to represent our data structure. The functor constraint also migrates to the type parameters.
Now we can factor out `step1` and call it `project`, as it projects our data structure:

```
type ListF[A, B] = Option[(A, B)]

implicit def functor[A]: Functor[ListF[A, ?]] = Functor[Option].compose[(A, ?)]

def projectList[E]: List[E] => ListF[E, List[E]] = {
  case Nil          => None
  case head :: tail => Some((head, tail))
}

def foldRight[F[_]: Functor, S, B](f: F[B] => B)(project: S => F[S]): S => B =
  new (S => B) { kernel =>
    def apply(init: S): B = f(project(init).fmap(kernel))
  }
```

To help the Scala compiler a little bit, we need to state the type of the `prodFlist` function explicitly.

```
val prodFlist: ListF[Int, Int] => Int = {
  case None         => 1
  case Some((x, y)) => x * y
}
```

We are ready to examine our brand new `foldRight`:

```
foldRight(prodFlist)(projectList).apply(1 :: 10 :: 20 :: Nil) // 200: Int
```
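To see the payoff of this abstraction, the same recipe applies to any structure that has a pattern functor and a `project` function. As a hypothetical illustration (the names here are invented), natural numbers have the pattern functor `Option[A]`: `None` is zero, `Some(n)` is the successor. Specializing the fold to this functor, dependency-free:

```scala
// Pattern functor for naturals: None = zero, Some(a) = successor
type NatF[A] = Option[A]

def projectNat: Int => NatF[Int] =
  n => if (n <= 0) None else Some(n - 1)

// The same fold shape as the generic foldRight, specialized to NatF
def foldNat[B](f: NatF[B] => B): Int => B =
  new (Int => B) { kernel =>
    def apply(init: Int): B = f(projectNat(init).map(kernel))
  }

// Example: render a number in unary notation
val toUnary = foldNat[String] {
  case None    => ""
  case Some(s) => "|" + s
}

assert(toUnary(4) == "||||")
```

Different structure, same scheme: only the pattern functor and the projection changed.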

The result, of course, stays the same. But look how much more abstract it is now!
We should do exactly the same thing with `unfold`, right?
Here are a few hints for solving this exercise for `unfold`:

- computation
- recursion
- packing/embedding into a data structure

### Implementation of `unfold`

```
def embedList[E]: ListF[E, List[E]] => List[E] = {
  case None          => Nil
  case Some((e, le)) => e :: le
}

val range: Int => ListF[Int, Int] =
  v => if (v <= 0) None else Some((v, v - 1))

def unfold[F[_]: Functor, S, A](f: (A) => F[A])(embed: F[S] => S): A => S =
  new (A => S) { kernel =>
    def apply(init: A): S = embed(f(init).fmap(kernel))
  }

unfold(range)(embedList).apply(10) // List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1): List[Int]
```

## Refactoring 3: here come `cata` and `ana`

Whenever you generalize functions like this, take a critical look at your generalized function when you have finished. Although the function may have been motivated by some specific use case (`List`), the signature and implementation may have a more general meaning. In this case, `foldRight` and `unfold` are perhaps no longer the most appropriate names for these functions. These functions, which often come up in recursion schemes libraries, are called `cata` and `ana`.

```
def cata[F[_]: Functor, S, B](f: F[B] => B)(project: S => F[S]): S => B =
  new (S => B) { kernel =>
    def apply(init: S): B = f(project(init).fmap(kernel))
  }

def ana[F[_]: Functor, S, A](f: (A) => F[A])(embed: F[S] => S): A => S =
  new (A => S) { kernel =>
    def apply(init: A): S = embed(f(init).fmap(kernel))
  }

cata(prodFlist)(projectList).apply(1 :: 10 :: 20 :: Nil) // 200: Int
ana(range)(embedList).apply(10) // List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1): List[Int]
```

`F[A] => A` is often called an `Algebra` in the category theory/recursion schemes literature, and its dual, `A => F[A]`, is called a `Coalgebra`.
Let’s introduce aliases for these types:

```
type Algebra[F[_], A] = F[A] => A
type Coalgebra[F[_], A] = A => F[A]
```

## Cata and Ana with Algebra & Coalgebra

```
val productOpA: Algebra[ListF[Int, ?], Int] = {
  case None         => 1
  case Some((x, y)) => x * y
}

val rangeOpC: Coalgebra[ListF[Int, ?], Int] =
  n => if (n <= 0) None else Some((n, n - 1))

def cata[F[_]: Functor, S, B](algebra: Algebra[F, B])(project: Coalgebra[F, S]): S => B =
  new (S => B) { kernel =>
    def apply(input: S): B =
      algebra(project(input).fmap(kernel))
  }

def ana[F[_]: Functor, S, A](coalgebra: Coalgebra[F, A])(embed: Algebra[F, S]): A => S =
  new (A => S) { kernel =>
    def apply(init: A): S =
      embed(coalgebra(init).fmap(kernel))
  }

def projectListC[A]: Coalgebra[ListF[A, ?], List[A]] = {
  case Nil          => None
  case head :: tail => Some((head, tail))
}

def embedListA[A]: Algebra[ListF[A, ?], List[A]] = {
  case None               => Nil
  case Some((head, tail)) => head :: tail
}

cata(productOpA)(projectListC).apply(1 :: 10 :: 20 :: Nil) // 200: Int
ana(rangeOpC)(embedListA).apply(10) // List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1): List[Int]
```

## Use in practice: Skeuomorph

Throughout this post, we’ve used these two examples:

```
{
  val rangeList: Int => List[Int] = ana(rangeOpC)(embedListA)
  rangeList(10)

  val productList: List[Int] => Int = cata(productOpA)(projectListC)
  productList(1 :: 10 :: 20 :: Nil)
}
```

And if you look at the signatures of `rangeList` and `productList`, they line up perfectly, so we can compose them! We can come up with a really clean and fun factorial implementation:

```
{
  val rangeList: Int => List[Int] = ana(rangeOpC)(embedListA)
  val productList: List[Int] => Int = cata(productOpA)(projectListC)
  val factorial: Int => Int = productList compose rangeList
  factorial(4)
}
```

This operation, unfolding a data structure from a seed value (a coalgebra) and then computing a final result by folding over that data structure (an algebra), is called a hylomorphism (or, for short, `hylo`).

```
def hylo[F[_]: Functor, A, B](algebra: Algebra[F, B], coalgebra: Coalgebra[F, A]): A => B =
  new (A => B) { kernel =>
    def apply(init: A): B =
      algebra(coalgebra(init).fmap(kernel))
  }

{
  val factorial: Int => Int = hylo(productOpA, rangeOpC)
  factorial(4)
}
```
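For reference, the whole pipeline collapses to a few dependency-free lines once we specialize the functor to `Int` elements (a self-contained sketch mirroring the definitions above, with no kind-projector needed):

```scala
// ListF specialized to Int elements
type IntListF[B] = Option[(Int, B)]

// Algebra: fold one layer into a product
val productOpA: IntListF[Int] => Int = {
  case None         => 1
  case Some((x, y)) => x * y
}

// Coalgebra: unfold a seed into one layer
val rangeOpC: Int => IntListF[Int] =
  n => if (n <= 0) None else Some((n, n - 1))

// hylo specialized to IntListF: unfold with the coalgebra,
// fold with the algebra, mapping over the second tuple component
def hyloIntListF[A, B](algebra: IntListF[B] => B, coalgebra: A => IntListF[A]): A => B =
  new (A => B) { kernel =>
    def apply(init: A): B =
      algebra(coalgebra(init).map { case (i, a) => (i, kernel(a)) })
  }

val factorial: Int => Int = hyloIntListF(productOpA, rangeOpC)

assert(factorial(4) == 24)
```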

The canonical example is the factorial function; however, `hylo`, `ana`, and `cata` are far more useful than yet another fancy way to compute a factorial.
Think of any operation that needs to build, and then collapse, a data structure. One such operation that comes to mind, and that you may encounter in production, is the transformation between different schemas in Scala.
At 47 Degrees, we open-sourced a library called `Skeuomorph` that solves this problem. Let’s briefly discuss how it does it.

Suppose we need to parse a `.proto` file and convert it to our internal representation.

### Parser as an anamorphism in Skeuomorph

First things first, we need to write a pattern functor for our structure. It can look similar to the following:

```
sealed trait ProtobufF[A]

object ProtobufF {
  final case class TUint64[A]() extends ProtobufF[A]
  final case class TBool[A]() extends ProtobufF[A]
  final case class TString[A]() extends ProtobufF[A]
  final case class TNamedType[A](name: String) extends ProtobufF[A]
  final case class TRepeated[A](value: A) extends ProtobufF[A]
  final case class TMessage[A](name: String, fields: List[FieldF[A]]) extends ProtobufF[A]
}
```

Please keep in mind that this is a simplified version.

Once we have the pattern functor, we can write the `Coalgebra`:

```
import higherkindness.droste.Coalgebra

def fromFieldTypeCoalgebra(): Coalgebra[ProtobufF, Type] = Coalgebra {
  case Type.TYPE_BOOL    => TBool()
  case Type.TYPE_STRING  => TString()
  case Type.TYPE_UINT64  => TUint64()
  case Type.TYPE_MESSAGE => ???
  // ...
}
```

Here, on the left-hand side, we see the native protobuf types that correspond to the `ProtobufF` types on the right-hand side. Therefore, our program now knows how to map types.

As we mentioned before, `cata` and `ana` are implemented in the `Droste` library. Let’s borrow `scheme.ana` and `Embed` from it to implement the parser:

```
def fromFieldType[A]()(fieldType: Type)(coalg: Coalgebra[ProtobufF, Type])(
    implicit A: Embed[ProtobufF, A]): A =
  scheme.ana(coalg).apply(fieldType)
```

### Transformations with `cata` in Skeuomorph

We know we need to have an internal representation of the schema:

```
sealed trait SchemaF[A]

object SchemaF {
  final case class Field[A](name: String, tpe: A)

  final case class TLong[A]() extends SchemaF[A]
  final case class TBoolean[A]() extends SchemaF[A]
  final case class TString[A]() extends SchemaF[A]
  final case class TNamedType[A](name: String) extends SchemaF[A]
  final case class TList[A](value: A) extends SchemaF[A]
  final case class TProduct[A](name: String, fields: List[Field[A]]) extends SchemaF[A]
}
```

We introduced a new type here, `SchemaF`. Now we have to determine how to transform from the pattern functor `ProtobufF` to that type:

```
/**
 * Transform a Protobuf schema into SchemaF.
 */
def transformProto[A](implicit A: Embed[SchemaF, A]): Trans[ProtobufF, SchemaF, A] = Trans {
  case ProtobufF.TUint64() => TLong()
  // ...
  case ProtobufF.TRepeated(value) => TList(value)
  case ProtobufF.TMessage(name, fields) => TProduct(name, fields.map(f => Field(f.name, f.tpe)))
}
```

And `cata` will traverse the data structure and return the result in `SchemaF` terms.

```
scheme.cata(transformProto[U].algebra)
```

We could implement a parser or a transformation for any other types following the same simple steps; what’s more, we can transform from any kind of protocol to any other, using `SchemaF` as an intermediate step! In future blog posts, we will cover the topic in more detail. To play with it further, we invite you to check out the Ersatz project on our GitHub page, where you can run an example and see the results. Ersatz is a simplified version of Skeuomorph; experienced readers may want to proceed to Skeuomorph directly.

## Summary

Recursion schemes are a compelling technique that separates the business logic, the code everyone wants to focus on, from the recursion pain: by decoupling how a function recurses over data from what the function actually does, we reduce cognitive overhead and can focus entirely on the core behavior of our recursive functions. Recursion schemes also abstract away from data structures: it doesn’t matter whether we work with lists, trees, graphs, database schemas, protocol buffers, or Avro; they allow traversing these different structures in the same way.

## Acknowledgments

Huge thanks to Andy Scott for his help in preparation and valuable feedback.