Sympathy and Problem-Solving

Lately, I’ve been thinking about the ideas of mechanical and computational sympathy. Mechanical sympathy is writing code and data structures with their real-world implementation (i.e. the hardware) in mind: how they will actually be run on the CPU. Computational sympathy is thinking about the code from the standpoint of how it works conceptually, that is, expressing it as a translation of thoughts. It’s a matter of the manner of expression: whether the concern is how it will be run on the hardware, or how it will run from a conceptual perspective.

The divide between writing Haskell and assembly or C code is what’s driving this. It’s illuminating to look at the concerns I have when writing code in these languages, and what my sympathies tend to be. In C, I focus on things like structure packing and how code will be translated into assembly. The end goal is to express some idea to the machine, explaining to it how the code should be run and taking into account how memory is accessed. I’m explaining my problem to the hardware and to the operating system. In languages like Haskell and Lisp, I’m expressing the problem to myself and other people. The code may not be as performant, but whether that’s an issue depends on the problem.
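To make the structure packing concern concrete, here is a minimal sketch in Go (the C version would look much the same). The type names Padded and Packed are made up for illustration, and the sizes printed depend on the target: on common 64-bit platforms I would expect 24 and 16 bytes, but treat that as an assumption rather than a guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Padded orders its fields carelessly: each bool is followed by padding
// so that the int64 (and the struct as a whole) stays aligned.
type Padded struct {
	A bool
	B int64
	C bool
}

// Packed holds the same fields, largest first, so the two bools sit
// next to each other at the tail of the struct.
type Packed struct {
	B int64
	A bool
	C bool
}

// paddedSize and packedSize report the in-memory size of each layout.
func paddedSize() uintptr { return unsafe.Sizeof(Padded{}) }
func packedSize() uintptr { return unsafe.Sizeof(Packed{}) }

func main() {
	fmt.Println("padded:", paddedSize(), "bytes")
	fmt.Println("packed:", packedSize(), "bytes")
}
```

Nothing about the meaning of the data changed between the two types; only the explanation to the machine did.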

I was rewatching Rich Hickey’s talk “Hammock Driven Development” and I’ve noticed that at work, I’ve actually been doing the things he describes (even though we program only in Go or C). I start with the question “what are we trying to accomplish?” What is the end result? I spend more time figuring out what the problem actually is, and then I ask my favourite question:

What are the characteristics of a system that solves our problem?

This isn’t just a question of what the API looks like. There are also questions about the operational characteristics:

  • How will it be deployed?
  • How will it be backed up?
  • What does a data recovery plan for this system look like?
  • How is access to the system limited?
  • How is sensitive cryptographic key material stored, and how is it protected?

My focus is on cryptographic systems right now, so these questions may not be relevant in another system. It’s important, though, to think about not only who will use the system, but who will run it. When the hardware dies at 2 AM, having considered the recovery and operations characteristics will make life better. If the system stops working, it’s no longer solving the problem: it is a problem.

In secure systems, there’s also the consideration of tradeoffs: we can’t build a system that’s completely impervious, and also buildable and operable. Understanding the characteristics of the system means understanding what security tradeoffs we can make. We also have to balance performance. A certain part of the system may not be the fastest implementation, but is that the bottleneck compared to the cryptographic operations? Sure, concatenating a number of strings may not be as performant as working with byte arrays, but how does that performance hit fit into the picture? Is the ability to reason about these string operations more useful than the extra twenty milliseconds of performance for a given operation?
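One way to answer the string question is to measure it rather than guess. The following is a rough sketch, not a proper benchmark: the function names and the input are invented for illustration, and a single timed call like this is noisy. It only gives a sense of scale to compare against the cost of the cryptographic operations.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// concatStrings builds the result with +=, which is easy to read but
// typically allocates a new string on each iteration.
func concatStrings(parts []string) string {
	out := ""
	for _, p := range parts {
		out += p
	}
	return out
}

// concatBytes writes into a single growing byte buffer instead.
func concatBytes(parts []string) string {
	var buf bytes.Buffer
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

// timeIt reports how long one call to f takes.
func timeIt(f func()) time.Duration {
	start := time.Now()
	f()
	return time.Since(start)
}

func main() {
	// Hypothetical input: 1000 short fields.
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "field;"
	}
	fmt.Println("strings:", timeIt(func() { concatStrings(parts) }))
	fmt.Println("bytes:  ", timeIt(func() { concatBytes(parts) }))
}
```

If the difference is small next to the cryptographic work, the version that is easier to reason about wins.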

We also have to stay focused and make sure we’re not trying to get too fancy with the security characteristics. Sometimes fancy things really are required, but we have been actively avoiding them where they’re not necessary. My second favourite question at work is:

What is this secure against?

When I’m adding security capabilities, I really strive to make sure it’s providing meaningful security. How do these security measures fit into the characteristics of the system?

Once I understand the characteristics, I spend some time thinking about the shape of the system. This usually involves drawing a module map on a whiteboard. What building blocks are needed to build the system? What are their characteristics? How do they interact? There’s a good essay around about keeping a program in your head, but most of the time you won’t be able to keep the whole program in your head. A module map helps me keep the big picture in my head, with the ability to “zoom-in” on subsets of the problem as I need while still being able to tie them back to the big picture. Unfortunately, I work in an open-plan office, so I sometimes come in on the weekend (and take a day off during the week, or leave early) to do this in peace.

At this point, I still haven’t worked out how these pieces are built: once I have an understanding of the problem, the first step is to look for off-the-shelf solutions. With the past few systems, we found a few solutions but their characteristics didn’t match ours. It’s useful during this step to take note of what was useful about these systems, and what wasn’t.

Once I have figured out the characteristics and surveyed the related systems, I start looking at how this will be implemented. What are the specific technologies (from ciphers to deployment systems) that should be used? What useful pieces can I extract (if I can’t use the whole thing) from the other solutions?

At this point, I can begin writing code. Understanding the characteristics is a useful driver for writing tests: does the system exhibit the characteristics I want?
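As a sketch of what a characteristic-driven test might look like, take the earlier question of how key material is stored. The helpers sealKey, openKey and checkCharacteristics below are hypothetical, and AES-GCM is only an assumed choice for the example, not a description of any real system. In a real project the checks would live in a _test.go file; here they are a plain function so the sketch runs on its own.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newAEAD builds an AES-GCM instance from a 16, 24 or 32 byte key.
func newAEAD(wrappingKey []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(wrappingKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealKey encrypts secret and returns nonce || ciphertext.
func sealKey(wrappingKey, secret []byte) ([]byte, error) {
	aead, err := newAEAD(wrappingKey)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, secret, nil), nil
}

// openKey reverses sealKey and fails if the blob was modified.
func openKey(wrappingKey, sealed []byte) ([]byte, error) {
	aead, err := newAEAD(wrappingKey)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed blob too short")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], nil)
}

// checkCharacteristics asserts three properties we want from storage:
// no plaintext at rest, a working round trip, and tamper detection.
func checkCharacteristics() error {
	wrappingKey := make([]byte, 32)
	if _, err := rand.Read(wrappingKey); err != nil {
		return err
	}
	secret := []byte("hypothetical signing key material")
	sealed, err := sealKey(wrappingKey, secret)
	if err != nil {
		return err
	}
	if bytes.Contains(sealed, secret) {
		return errors.New("stored blob contains the plaintext key")
	}
	opened, err := openKey(wrappingKey, sealed)
	if err != nil || !bytes.Equal(opened, secret) {
		return errors.New("round trip failed")
	}
	sealed[len(sealed)-1] ^= 1 // flip one bit of the stored blob
	if _, err := openKey(wrappingKey, sealed); err == nil {
		return errors.New("tampering went undetected")
	}
	return nil
}

func main() {
	fmt.Println("characteristics hold:", checkCharacteristics() == nil)
}
```

Each check maps back to a characteristic written down before any code existed, which is the point: the tests ask whether the system is the one I set out to build.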