Last Modified on Jul 18, 2021

- (2021.07.24-25) Fixed/Reworked `cata` examples (per r/Tarmen) and changed definition of example tree.

Code for this project can be found in my experiments github repo (probability-tree-schemes folder).

The goal of this post is to present a simple example that uses recursion schemes. My other goal is to discuss the reasons for getting stuck on the way to the right solution. That part will be a bit of a rant.

**Motivation:**

Last week I encountered a programming problem that kept defeating me. It took me forever just to figure out how to translate the requirements into something I could code. The gist of the problem turned out to be very straightforward: implement a probability tree diagram (a decision tree annotated with probabilities). Well, not really, but close enough.

I think, now, that the problem is simple, but that was not the case during the few hours I initially spent on it. This experience was somewhat surprising to me, as I have worked for several years with decision graphs (not trees, but still). I just failed to make the association.

I am sharing my experience in this post. It shows the solution program and, as such, ended up becoming a “tutorial” of sorts about recursion schemes. I tried to make it easy to read, even if some of the Haskell code is more at an intermediate level. In this tutorial I will get in trouble a couple of times by arriving at an incorrect solution.

At the end of this post (Why Finding the Right Solution is Sometimes Hard?) I rant about my beliefs pertaining to that question.

## The Problem

Consider a tree with edges (or nodes) annotated with probability weights. The problem I am going to solve is simple: calculate the path likelihoods for each of the *leaves*. Basically multiply all the probabilities down each branch all the way to the leaf.

This is obviously not so hard; the biggest challenge was figuring out that this was what the requirements called for.

The solution seems like a good exercise and I am sharing my solutions here.

All the goodies (`a`) are at the leaves; in addition, each node is annotated with some probability `p` and a label `String`:

```
import Data.Functor.Foldable.TH

data ProbTree p a =
    Leaf p String a
  | Branches p String [ProbTree p a]
  deriving (Eq, Show, Functor, Foldable, Traversable)

makeBaseFunctor ''ProbTree
```

The goal is to implement a function that transforms `ProbTree NodeProb a` into `ProbTree CumulativeProb a`.

I will be using the *recursion-schemes* package, and `makeBaseFunctor` generates the code I will need to fold or unfold the tree. All the instances I have declared (`Functor`, `Foldable`, `Traversable`) are intended for the consumption of the goodies `a`. I will not care much about the `a`-s in this post. I will care about the `p`-s.

This type is really a `Bifunctor` (a functor in both `p` and `a`), and most of the code I am going to show could just use that. Instead of implementing a `Bifunctor` instance I will use:

```
import qualified Data.List as L

probMap :: (p1 -> p2) -> ProbTree p1 a -> ProbTree p2 a
probMap fn (Leaf p l a) = Leaf (fn p) l a
probMap fn (Branches p l xs) = Branches (fn p) l $ L.map (probMap fn) xs
```

However, for some examples I will need something stronger: I need it to be `Traversable` in `p`.

I will just use the *lens* package to get what I need. You do not need to know much about lenses to read on. I will try to explain the usage as I go.

Keeping `p` as a type variable is convenient and could, for example, allow me to use it for (`Natural`) distribution counts or other things that probabilities can morph into.

### Why is this useful?

Decision Trees are popular because of their ability to visualize outcomes of various processes.

It is sometimes useful to know the distribution of the final outcomes. Think about using *QuickCheck* to randomly generate final outcomes or, maybe, randomly pick these outcomes from some big dataset of samples.

`ProbTree p a` would use `a` to hold the generated values or to describe them in some way. It is convenient to keep the data samples `a` separate from the probabilities `p` (instead of just using something like `RoseTree (p, a)`) because we care about these samples only at the final outcome / leaf level.

The extra `String` label is for my own convenience. It allows me to trace how the recursion schemes work and will be handy in this presentation.

`ProbTree p a` may not be the nicest type to use when implementing recursion schemes. All the better as a choice of training exercise!

### Plumbing

So I have a lens that will let me ‘focus’ on the `p`-s:

```
import Control.Lens

-- | needs to be non-polymorphic because of nesting
probability :: Lens' (ProbTree p a) p
probability = lens get set
  where
    get (Leaf p _ _) = p
    get (Branches p _ _) = p
    set (Leaf _ l x) r = Leaf r l x
    set (Branches _ l x) r = Branches r l x
```

It is considered bad karma to use record syntax on types that have several constructors, so I coded this by hand. Otherwise, Template Haskell's `makeLenses` could have created `probability`, had both constructors shared a field with the same name `_probability :: p`. If you are new to lenses, think of the above code as *encapsulating* the getting and setting of the `p`-s.

This is what is called a simple lens. That means it does not allow me to change the `p` type. I will need something more flexible than that (`makeLenses` does not create what I need anyway).
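If the `lens` machinery feels magical, here is a hand-rolled sketch of what a van Laarhoven lens boils down to. The names `SimpleLens`, `mkLens`, `view'`, and `over'` are mine (the *lens* package calls them `Lens'`, `lens`, `view`, `over`); this is just to show what `lens get set` encapsulates:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- a simple lens is a function polymorphic in a Functor
type SimpleLens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- the analogue of `lens`: build a lens from a getter and a setter
mkLens :: (s -> a) -> (s -> a -> s) -> SimpleLens s a
mkLens get set f s = set s <$> f (get s)

-- `view` instantiates the Functor to Const, keeping only the focus
view' :: SimpleLens s a -> s -> a
view' l = getConst . l Const

-- `over` instantiates it to Identity, rebuilding the structure
over' :: SimpleLens s a -> (a -> a) -> s -> s
over' l g = runIdentity . l (Identity . g)

-- example: a lens focusing on the first component of a pair
fstL :: SimpleLens (x, y) x
fstL = mkLens fst (\(_, b) a -> (a, b))
```

`view' fstL (0.5, "R")` extracts the focus and `over' fstL (* 2) (0.5, "R")` modifies it, which is exactly how `probability` behaves on a `ProbTree` node.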

In a dream, dependently typed world there would be a type-level guarantee that in `Branches (p1 :: p) String [ProbTree (pn :: p) a]` we have `1 ~ sum pn` for probability weights and `p1 ~ sum pn` for the cumulative case.

In reality, it would be nice to, at least, not use `p ~ Float` all the time.

Since I will be changing the meaning of `p` from a probability assigned to a node to the calculated cumulative probability of getting to that node:

```
newtype NodeProb = NodeProb {unNodeProb :: Float} deriving Eq
newtype CumulativeProb = CumulativeProb {unCumulativeProb :: Float} deriving Eq
-- Show instances implemented, but not shown (pun not intended)
```

To deal with swapping the `p`'s I need to implement an honest-to-god traversal. Consider this just boilerplate:

```
probabilityT :: forall p r a f . Applicative f =>
     (p -> f r)
  -> ProbTree p a
  -> f (ProbTree r a)
probabilityT fn (Leaf p l a) = Leaf <$> fn p <*> pure l <*> pure a
probabilityT fn (Branches p l xs) = Branches <$> fn p <*> pure l <*> fx xs
  where
    fx :: [ProbTree p a] -> f [ProbTree r a]
    fx = traverse (probabilityT fn)
```

Now I can implement cumulative probabilities working with `Float`s and expose them to external code using more descriptive types (*lens* `Iso`-s would be nicer; this seems to work better in the context of this post):

```
compWithFloats ::
     (ProbTree Float a -> ProbTree Float a)
  -> ProbTree NodeProb a
  -> ProbTree CumulativeProb a
compWithFloats fn = over probabilityT CumulativeProb . fn . over probabilityT unNodeProb
```

### Example Tree

I am now in the position to define a simple example tree:

```
exTree :: ProbTree NodeProb ()
exTree = over probabilityT NodeProb $ Branches 1 "R" [
    Branches 0.5 "1" [
        Branches 0.5 "11" [
            Leaf 0.5 "111" ()
          , Leaf 0.5 "112" ()
          ]
      , Branches 0.5 "12" [
            Leaf 1 "121" ()
          ]
      ]
  , Branches 0.5 "2" [
        Leaf 0.2 "21" ()
      , Leaf 0.4 "22" ()
      , Branches 0.4 "23" [
            Leaf 0.5 "231" ()
          , Leaf 0.5 "232" ()
          ]
      ]
  ]
```

The lens `over` allows me to type the `Float`s and `map` them to the `NodeProb` type, indicating at the type level what the numbers represent. I will use this example moving forward. The final result (leaves only) of what I want to compute, annotated with labels, should look like this:

`"(0.125,"111"),(0.125,"112"),(0.25,"121"),(0.1,"21"),(0.2,"22"),(0.1,"231"),(0.1,"232")"`

### Traditional Solution, State Threads

There is a good chance of finding a ready solution for this problem on *stackoverflow* or even in *wikipedia*.

The code in some pseudo-language that mutates variables would look something like this:

```
// some C-like language
// mutate in-place all probabilities on all nodes of a tree
calculateProbNode (weight, node) =
    node.prob = weight * node.prob
    children = node.children
    for (child : children) {
        calculateProbNode (node.prob, child)
    }

calculateProb (node) = calculateProbNode (1, node)
```

Since I moved to Haskell, code like this has started to scare me. It mutates things, returns nothing, and it is easy to get wrong.

However, there are reasons (e.g. performance) why code like this is sometimes needed.

There is a misconception about Haskell not being able to mutate variables. In particular, variable mutation in an internal implementation that leads to a referentially transparent result can even be implemented as a pure function. The tooling for this is called the `ST` monad.
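As a minimal sketch of that idea (the function name `sumST` is mine, for illustration): a pure function whose internals allocate and mutate an `STRef`, with the mutation fully hidden by `runST`:

```haskell
import Control.Monad.ST
import Data.STRef

-- a pure function: the mutation happens inside runST and cannot leak out
sumST :: [Float] -> Float
sumST xs = runST $ do
    ref <- newSTRef 0                       -- allocate a mutable cell
    mapM_ (\x -> modifySTRef ref (+ x)) xs  -- mutate it in a loop
    readSTRef ref                           -- read the final value back out
```

Despite the imperative-looking body, `sumST` is referentially transparent; callers cannot observe the `STRef`.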

This is the above code implemented in Haskell:

```
import Control.Monad.ST
import Data.STRef

type MutableFloat s = STRef s Float

computeProbMutable :: forall a . ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbMutable = compWithFloats (
    \t -> runST $ do
        mutable <- makeMutable t
        mutableres <- recursion 1 mutable
        makeNonMutable mutableres
    )
  where
    recursion :: Float -> ProbTree (MutableFloat s) a -> ST s (ProbTree (MutableFloat s) a)
    recursion n (Branches mutp l xs) = do
        modifySTRef mutp (n *)        -- modify probability in place
        n1 <- readSTRef mutp          -- read modified probability
        xs1 <- mapM (recursion n1) xs -- recursively apply modified value to children
        return $ Branches mutp l xs1
    recursion n x@(Leaf mutp _ _) = do
        modifySTRef mutp (n *)        -- on Leaf there is not much to do, only modify the probability
        return x

-- | Traverse all tree nodes and replace `Float`-s with `MutableFloat`-s.
makeMutable :: ProbTree Float a -> ST s (ProbTree (MutableFloat s) a)
makeMutable = traverseOf probabilityT (newSTRef @Float)

-- | Traverse all tree nodes and replace `MutableFloat`-s with regular non-mutable `Float`-s.
makeNonMutable :: ProbTree (MutableFloat s) a -> ST s (ProbTree Float a)
makeNonMutable = traverseOf probabilityT readSTRef

-- >>> tstMutable
-- "(0.125,\"111\",()),(0.125,\"112\",()),(0.25,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.1,\"231\",()),(0.1,\"232\",())"
tstMutable :: String
tstMutable = printLeaves $ computeProbMutable exTree
```

The bulk of the code is translating from non-mutable Haskell to the mutable variable `type MutableFloat s = STRef s Float`. The rest is very similar to the pseudo-code above.

If you are not familiar with `ST`: the `s` is existential and prevents the mutable `Float`-s from escaping the `ST` monad. You can use them inside `ST`, but not after you leave it.

What is mind-blowing to me is that Haskell introduced this in 1993! Now such existential tricks are used a lot (e.g. in dependently typed code) so the concept probably does not need much introduction.

### Simple Recursion Schemes

The above solution leaves a bad taste in my mouth. The idea I will pursue is that I can compute these values by *folding* or maybe *unfolding* the tree. This example uses the *recursion-schemes* package. I am not going to delve deep into recursion schemes. My goal is to show the code with high level explanations for how and why it works.

Great information about some of the category-theoretical background can be found in Bartosz Milewski's CTFP book, the F-Algebras section.

There is also a series of blog posts devoted to recursion schemes by Patrick Thomson, An Introduction to Recursion Schemes, that is very good reading as well.

**Quick intro to recursion-schemes.**

The Template Haskell instruction we started with, `makeBaseFunctor ''ProbTree`, generated a new type that looks like this:

`data ProbTreeF p a r = LeafF p String a | BranchesF p String [r]`

As you can see, `ProbTreeF` is very similar to `ProbTree`, except the original ‘recursive’ `ProbTree p a` in `Branches p String [ProbTree p a]` got replaced with a new type variable `r` in `BranchesF p String [r]`.

*recursion-schemes* uses the `Base t` type family, and `ProbTreeF p a r` is the *Base* functor for `ProbTree`:

```
type family Base t :: * -> *
type instance Base (ProbTree p a) = ProbTreeF p a
```

The library uses `Base` in all of its definitions. I will use `ProbTreeF` for clarity.

Classic recursion-scheme code often starts with the base functor and derives the recursive target type by applying *Fix* from `data-fix`. The two are isomorphic:

`Fix (ProbTreeF p a) =~ ProbTree p a`

If this is new to you, it will take some time to digest. (**end**)
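To make the `Fix` / base-functor idea concrete, here is a tiny self-contained sketch with a hand-rolled `Fix` and a hand-rolled `cata` (names `cataFix`, `exList`, `sumList` are mine; the package versions have the same shape):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- the fixed point of a functor: each layer of f wraps more layers of f
newtype Fix f = Fix (f (Fix f))

-- base functor for lists, with the recursion replaced by the variable `r`
data ListF a r = Nil | Cons a r deriving Functor

-- a generic fold: unwrap one layer, fold the children, apply the algebra
cataFix :: Functor f => (f b -> b) -> Fix f -> b
cataFix alg (Fix x) = alg (fmap (cataFix alg) x)

-- the list [1, 2, 3] written as a fixed point of ListF
exList :: Fix (ListF Int)
exList = Fix (Cons 1 (Fix (Cons 2 (Fix (Cons 3 (Fix Nil))))))

-- sum the list by giving cataFix an algebra
sumList :: Fix (ListF Int) -> Int
sumList = cataFix alg
  where
    alg Nil = 0
    alg (Cons a acc) = a + acc
```

`makeBaseFunctor` spares us from writing `ProbTreeF` by hand, and the library's `cata` works on `ProbTree` directly instead of requiring the `Fix`-wrapped form.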

#### Catamorphism

As a warm-up, let’s fold the tree by collecting information about the leaves into a single `String`:

```
printLeaves :: forall p a. (Show p, Show a) => ProbTree p a -> String
printLeaves = fold fn
  where
    fn :: ProbTreeF p a String -> String
    fn (LeafF n lb a) = show (n, lb, a)
    fn (BranchesF _ _ xs) = L.intercalate "," xs
```

This code folds the tree onto a `String` by spelling out the `LeafF` content and then concatenating the strings with commas when processing `BranchesF`.

**Back to the quick intro:**

The *Base* functor for *List* (i.e. `[]`) looks like this:

`data ListF a b = Nil | Cons a b`

Folding a list would amount to:

```
printList :: (Show a) => [a] -> String
printList = fold fn
  where
    fn :: ListF a String -> String
    fn Nil = "(START)"
    fn (Cons a str) = show a <> ":" <> str
```

Compare it to:

`printList' = L.foldr (\a str -> show a <> ":" <> str) "(START)"`

These are effectively the same!

The ability to fold arbitrary recursive data is really very convenient. For example, JSON data served by some micro-services may lack consistency (not all micro-services are implemented in Haskell, you know).

“What if I told you it is possible to fold JSON values?”

Morpheus, The Matrix

*recursion-schemes* defines *cata* as an alternative name for *fold*. *cata* seems to be more popular and stands for *catamorphism*. (**end**)

**Continuing the warm-up:**

We can do effectful things too!

```
printIO :: forall p a. (Show p, Show a) => ProbTree p a -> IO String
printIO = fold fn
  where
    fn :: ProbTreeF p a (IO String) -> IO String
    fn (LeafF n lb a) = do
        print (n, lb, a)   -- as before
        return $ "done leaf " <> lb
    fn (BranchesF _ lb ixs) = do
        xs <- sequence ixs
        print $ "Processed so far: " ++ show xs
        return $ "done branch " <> lb
```

This will print the content of the leaves and, when processing branches, will print what has been processed before.

It produces this output:

```
*Schemes> tstPrintIO
(0.5,"111",())
(0.5,"112",())
"Processed so far: [\"done leaf 111\",\"done leaf 112\"]"
(1.0,"121",())
"Processed so far: [\"done leaf 121\"]"
"Processed so far: [\"done branch 11\",\"done branch 12\"]"
(0.2,"21",())
(0.4,"22",())
(0.5,"231",())
(0.5,"232",())
"Processed so far: [\"done leaf 231\",\"done leaf 232\"]"
"Processed so far: [\"done leaf 21\",\"done leaf 22\",\"done branch 23\"]"
"Processed so far: [\"done branch 1\",\"done branch 2\"]"
"done branch R"
```

#### Solution using Anamorphism

Instead of using the `fold` / `cata`-morphism, I will first use its opposite, the `unfold` / `ana`-morphism, to solve the problem of computing the cumulative probabilities.

We will solve the problem by unfolding the tree onto itself (I will be constructing `ProbTreeF`).

*recursion-schemes* provides a convenient `project` function that can typically be used to implement all the uninteresting cases when unfolding a structure onto itself:

```
-- >>> printLeaves . computeProb $ exTree
-- "(0.125,\"111\",()),(0.125,\"112\",()),(0.25,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.1,\"231\",()),(0.1,\"232\",())"
computeProb :: ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProb = compWithFloats (unfold fn)
  where
    fn :: ProbTree Float a -> ProbTreeF Float a (ProbTree Float a)
    fn (Branches n l xs) = BranchesF n l (L.map (over probability (* n)) xs)
    fn x = project x
```

The only interesting case is unfolding the `Branches n l xs` constructor. It is unfolded into `BranchesF n l xs'`, where `xs'` is the list of children with their probabilities modified by the current value on the branch. This line makes the computation flow towards the leaves. Thus, we can assume that the current probability value `n` has already been computed.

Note that, to test-print the outcome, we unfolded the tree in `computeProb` and then folded it using `printLeaves`. There is a convenience function, `hylo` or `refold`, included in the *recursion-schemes* package that does exactly that.
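A hand-rolled version of that refold shows the idea (this `hylo'` and the `factorial` example are my sketch; the package's `hylo` has the same shape):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- base functor for lists, recursion replaced by `r`
data ListF a r = Nil | Cons a r deriving Functor

-- refold: unfold one layer with the coalgebra, recurse, fold with the algebra;
-- the intermediate structure is never fully materialized
hylo' :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo' alg coalg = alg . fmap (hylo' alg coalg) . coalg

-- factorial as an unfold to the layers n, n-1, ..., 1 fused with a product fold
factorial :: Int -> Int
factorial = hylo' alg coalg
  where
    coalg 0 = Nil
    coalg n = Cons n (n - 1)
    alg Nil = 1
    alg (Cons n acc) = n * acc
```

In our case the coalgebra would be `computeProb`'s `fn` and the algebra `printLeaves`' `fn`, fused into one pass.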

#### Back to Catamorphism.

(EDITED, previous version of this post had it wrong, thanks r/Tarmen)

In the previous section I unfolded the tree onto itself. That did not really unfold or construct much, as the unfolding type was the tree itself. We can just as well try the same trick using `cata` or `fold`, since we will not be destroying / folding much either.

*recursion-schemes* provides a convenient `embed` function that can typically be used to implement all the uninteresting cases:

```
-- |
-- >>> printLeaves . computeProbBad $ exTree
-- "(0.25,\"111\",()),(0.25,\"112\",()),(0.5,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.2,\"231\",()),(0.2,\"232\",())"
computeProbBad :: ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbBad = compWithFloats (cata fn)
  where
    fn :: ProbTreeF Float a (ProbTree Float a) -> ProbTree Float a
    fn (BranchesF n l xs) = Branches n l (L.map (over probability (* n)) xs)
    fn x = embed x
```

Notice there is a problem: the numbers are off. The problem is subtle. The *lens* `over` just modifies the current node; what we want is to apply the probability modification to the whole subtree:

```
-- >>> printLeaves . computeProbCata $ exTree
-- "(0.125,\"111\",()),(0.125,\"112\",()),(0.25,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.1,\"231\",()),(0.1,\"232\",())"
computeProbCata :: ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbCata = compWithFloats (cata fn)
  where
    fn :: ProbTreeF Float a (ProbTree Float a) -> ProbTree Float a
    fn (BranchesF n l xs) = Branches n l (L.map (probMap (* n)) xs)
    fn x = embed x
```
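The difference between the two fixes is easy to see on a toy rose tree (the type `T` and the names `overRoot` / `mapAll` are mine, purely for illustration):

```haskell
-- a minimal rose tree carrying just a probability at each node
data T = N Float [T] deriving (Eq, Show)

-- like `over probability`: touches only the root of the given subtree
overRoot :: (Float -> Float) -> T -> T
overRoot f (N p ts) = N (f p) ts

-- like `probMap`: pushes the modification through the whole subtree
mapAll :: (Float -> Float) -> T -> T
mapAll f (N p ts) = N (f p) (map (mapAll f) ts)
```

Applying `overRoot (* n)` to each child scales only the child itself, so in `computeProbBad` the correction never reaches the grandchildren; `mapAll` (i.e. `probMap`) carries it all the way down.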

This works, but it is interesting to play a little more with folding, just to see how else I can get in trouble.

#### ST Example

**Bad attempt:**

If you looked closely at the `printIO` example in the Catamorphism section, you may have noticed that the fold of `Branches` was able to do effectful things on the children. We can use the same approach to our advantage.

We will go into a mutating frenzy and fold `ProbTree` into just `MutableFloat s`:

```
computeSTBad :: forall s a . ProbTree (ST s (MutableFloat s)) a -> ST s (MutableFloat s)
computeSTBad = fold fn
  where
    fn :: ProbTreeF (ST s (MutableFloat s)) a (ST s (MutableFloat s)) -> ST s (MutableFloat s)
    fn (LeafF n _ _) = n
    fn (BranchesF stmn lb ixs) = do
        mn :: MutableFloat s <- stmn
        n :: Float <- readSTRef mn              -- read branch probability
        mxs :: [MutableFloat s] <- sequence ixs -- get access to the mutable probabilities of the direct children
        mapM_ (`modifySTRef` (n *)) mxs         -- mutate the children's probabilities (this even sounds wrong!)
        stmn                                    -- return back the untouched branch probability
```

Making this into a pure function requires even crazier mutations (this reuses the previously defined `makeMutable` and `makeNonMutable`):

```
computeProbMutableBad :: forall a . ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbMutableBad = compWithFloats (
    \t -> runST $ do
        mutable <- makeMutable t
        mutable' <- traverseOf probabilityT (pure . pure) mutable
        computeSTBad mutable'
        makeNonMutable mutable
    )
```

The result of `computeSTBad mutable'` is discarded, and I use `pure . pure` (double effectful?). How crazy is that!

But we see exactly the same issue as with our naive `cata` attempt in the previous section:

```
-- |
-- >>> tstMutableBad
-- "(0.25,\"111\",()),(0.25,\"112\",()),(0.5,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.2,\"231\",()),(0.2,\"232\",())"
tstMutableBad :: String
tstMutableBad = printLeaves $ computeProbMutableBad exTree
```

This approach applies the correction only to the probabilities of the node's direct children. We need to apply the correction all the way down the child subtree!

**So let’s fix it:**

```
computeST :: forall s a . ProbTree (ST s (MutableFloat s)) a -> ST s [MutableFloat s]
computeST = fold fn
  where
    fn :: ProbTreeF (ST s (MutableFloat s)) a (ST s [MutableFloat s]) -> ST s [MutableFloat s]
    fn (LeafF n _ _) = (:[]) <$> n
    fn (BranchesF stmn lb ixs) = do
        mn :: MutableFloat s <- stmn
        n :: Float <- readSTRef mn                         -- read branch probability
        mxs :: [MutableFloat s] <- concat <$> sequence ixs -- get access to the mutable probabilities of the whole subtree
        mapM_ (`modifySTRef` (n *)) mxs                    -- mutate the subtree's probabilities
        pure $ mn : mxs                                    -- return all mutable floats, including the current node

computeProbMutable :: forall a . ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbMutable = ...

-- |
-- >>> tstMutable
-- "(0.125,\"111\",()),(0.125,\"112\",()),(0.25,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.1,\"231\",()),(0.1,\"232\",())"
tstMutable :: String
tstMutable = printLeaves $ computeProbMutable exTree
```

It works! The trick was to change the folding structure to accumulate all the mutable floats for the whole subtree and keep mutating them.

The key is to find the correct structure to fold onto.

I do not like this type of code but, I think, it does provide interesting imperative intuitions about the recursive fold.

#### Catamorphism. Reader Example.

But there is a nicer solution, one that learns from the previous `ST` example and is very close to the idea of folding the tree onto itself:

```
import Control.Monad.Reader

-- >>> printLeaves . computeProbRdr $ exTree
-- "(0.125,\"111\",()),(0.125,\"112\",()),(0.25,\"121\",()),(0.1,\"21\",()),(0.2,\"22\",()),(0.1,\"231\",()),(0.1,\"232\",())"
computeProbRdr :: ProbTree NodeProb a -> ProbTree CumulativeProb a
computeProbRdr = compWithFloats (\t -> runReader (computePropRdr t) 1)

computePropRdr :: ProbTree Float a -> Reader Float (ProbTree Float a)
computePropRdr = cata fn
  where
    fn :: ProbTreeF Float a (Reader Float (ProbTree Float a)) -> Reader Float (ProbTree Float a)
    fn (BranchesF n l xs) = do
        p <- ask
        xss <- mapM (local (* n)) xs
        pure $ Branches (p * n) l xss
    fn (LeafF n l a) = do
        p <- ask
        pure $ Leaf (p * n) l a
```

If you are not using `Reader` a lot, it may be easier to look at this more explicit solution:

```
computeProbFn :: ProbTree Float a -> Float -> ProbTree Float a
computeProbFn = cata fn
  where
    fn :: ProbTreeF Float a (Float -> ProbTree Float a) -> (Float -> ProbTree Float a)
    fn (BranchesF n l xs) p = Branches (p * n) l $ L.map (\f -> f . (* n) $ p) xs
    fn (LeafF n l a) p = Leaf (p * n) l a
```

What we are accumulating is the ability to conjure a descendant tree, at any point, from the current probability `Float`.

We “conjure” the tree by modifying how the children are “conjured”: we pre-compose `(* n)` (`n` is the current node's probability) onto each of the functions that “conjure” the children. This forces recomputation of the probabilities all the way down to the leaves.

This no longer stops at depth 2.
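The same “fold into a function” trick works on a plain list, which may be easier to digest (the name `cumulative` is mine; this is a list analogue, not code from the tree solution):

```haskell
-- fold a list of weights into a function awaiting the incoming probability;
-- each step multiplies the incoming value in and passes the product down,
-- collecting the cumulative products along the way
cumulative :: [Float] -> Float -> [Float]
cumulative = foldr step (const [])
  where
    step w rest p = (p * w) : rest (p * w)
```

`cumulative [0.5, 0.5, 0.5] 1` yields the path probabilities `[0.5, 0.25, 0.125]`: just as in `computeProbFn`, the accumulator is a function, and each layer adjusts the argument it hands to the next one.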

#### Summary

I have shown a few ways to compute `ProbTree CumulativeProb a`. I ended up getting in trouble a couple of times. *In fact, the original version of this post had bugs in it!* Recursion is dangerous; recursion has gotchas; it has to! Recursion is power with a price tag. Types are not very helpful when we recurse:

```
myUndefined :: a
myUndefined = myUndefined
```

It is easy to get in trouble!

## Why Finding the Right Solution is Sometimes Hard?

**Rant Alert.** This is about the cases where we know, or knew, how to do things well but get stuck anyway. Obviously, finding better and better ways of writing code is the subject of never-ending research; I am not going to discuss that.

Haskell is an amazingly efficient tool for writing new code when our fingertips know what to do.

Sometimes they do not, and then Haskell is not much better than other languages. It lets us brute-force code to satisfy the requirements. It is Turing complete, meaning you can write as much nonsense as you want.

What bit me is:

- A short amount of time (a few hours)
- Difficulty in identifying a nice abstraction to solve the problem
- The abstractions I ended up wanting to use were not in my short memory

So, instead, I went about pushing out a brute-force solution…

This experience put me in a philosophical mood, trying to pin-point the reasons why sometimes the code just writes itself and sometimes we get stuck.

### Human condition

The underlying reason why I do something wrong is typically me.

We are not perfect, probably nothing shows it better than the software we write.

“The universal aptitude for ineptitude makes any human accomplishment an incredible miracle”

Stapp’s Law

Stapp's and Murphy's (his coworker's) laws are not about giving up; these statements are about understanding our limitations and acting on that understanding.

### Time Pressure

A hard deadline of a few hours does not (typically) happen in software engineering when writing new code. It does for some data scientists. It also happens in competitive programming. I am not a competitive programmer, and I believe very few programmers are.

Software engineers (sometimes) have to work to a short, hard timeline when troubleshooting critical application errors. I believe seasoned developers get years of conditioning that puts them into a troubleshooting mindset when under stress. This is not the creative mindset effective in designing new code.

For code that is not cookie-cutter, I believe I write the best code at night, when I sleep. Sometimes I need a few nights. It took two nights (with no involvement otherwise) and then less than an hour of coding for the core examples presented here. It is not just about the time spent on the task; it is often about the latency.

Time pressure can be very destructive. It can reduce both the development speed and the code quality.

That applies to longer time scales as well. The software industry, in my experience, is focused on short-term goals. Quality is a long-term goal.

No project can afford to have no time constraints, so a balance needs to be struck. I believe the trade-offs are not well understood by the decision makers. … But this is well beyond what this post is about.

‘Uncle Bob’ (Robert C. Martin) likes to talk about a process in which you are not done when you finish writing the code and it works. That is when the sweat of making the code nice begins.

### Large Bag of Tricks

In this analogy, a programmer is a magician with a bag of tricks. The bigger the bag, the better the programmer. But it also takes more effort to sort through a bigger bag. There are some interesting indexers over that solution space that allow really good programmers to find a trick faster. One such indexer is Category Theory. In general, any conceptual understanding of something is a record in some imaginary index.

The index record does not contain all the details; it is just a pointer, and it takes time to load the actual details into short-term memory. That could involve looking up some stuff in the code or re-reading an article… A developer is a magician who needs time to prepare before he pulls out an **old** trick.

### Not exactly Muscle Memory

*Muscle Memory* is a sports term. You train muscle memory so the movements happen on an auto-pilot.

Muscle memory tends to be long-term memory (“it is like riding a bike”), and it is also hard to change. This resistance to change can manifest itself as ‘bad muscle habits’ (e.g. a ski instructor skiing with a slight wedge, a static body when making a shot in tennis…). Muscle memory is long-term memory; however, athletes still require a warm-up lap / game / run…

For a long time, I have observed something very similar in programmers. Programming work is surprisingly repetitive, we develop something that resembles ‘muscle memory’ without even realizing it. We can do things much faster when we work on familiar problems. We are expert cookie-cutters.

Programmers tend to repeat the previously used patterns, even if these patterns are inferior to other methods (“bad muscle habits”).

Mainstream resistance to FP can be viewed as an example of resistance to change the “muscle memory”.

The analogy is not perfect. We do not need to do warm-up typing when we start work; instead, we need a longer warm-up period when switching to a programming area we have neglected for months or years. For me, that applies especially to the ability to translate ‘real world requirements’ into code design. I will just not think in terms that are not in my short-term memory, especially when under time pressure.

### Cookie cutters

Cookie cutting is not just about repeating ingrained habits. I believe there is some project to project, program to program inertia which makes us want to repeat the same approach. This is not necessarily a bad thing if we are aware of it and if we are not afraid to explore other options.

Even looking at the job interview process, notice the focus on finding someone with a very closely fitting background.

Have you ever seen a job posting asking for someone with a significantly different skill set from what the group already has? Hiring managers want to find cookie-cutters that can make cookies of a shape very similar to the ones their team bakes.

I think this is not optimal in the long term. This bias creates groups with a narrow pool of experience and expertise.

I have to consider myself a cookie cutter too. The choices I make in the code I design are biased by my other work.

### Summary

We can’t change who we are, we are unlikely to change the industry.

Instead, we can try to train “good movements” into our “muscle memory” and grow our bag of tricks.

“In short, I suggest that the programmer should continue to understand what he is doing, that his growing product remains firmly within his intellectual grip.”

Edsger W. Dijkstra

My conclusion is this: it does not really matter if I get stuck and can't figure something out for a few days. On some problems I will be stuck for much longer.

I am talking about pursuit of a well designed code, not finding a brute-force kludge.

What really matters is not giving up and eventually figuring it out.

“You must understand, young Hobbit, it takes a long time to say anything in Old Entish. And we never say anything unless it is worth taking a long time to say.”

J.R.R Tolkien and Treebeard

The original version of this post had bugs.

Yeah, what really matters is not giving up and eventually figuring it out ;) In this post I ate my own dog food.