As I write this in the early part of the twenty-first century, the tech industry is in a love affair with data. Instead of making product decisions based on hunches or wishful thinking, we use A/B testing to gather data on the alternatives and we choose according to what the data shows.

The dark side of the data obsession is an insistence that *every*
decision be backed up by data: you can’t make a move unless you
*prove* that it’s worthwhile. Or so the thinking goes.

There are lots of pitfalls with this thinking. One in particular is that, sometimes, you don’t need data to prove your point. You can do better: you can use math.

Math, for example, tells us that the sum of the lengths of any two sides of a triangle is greater than the length of the third side. It would be extremely dumb to demand that someone go out and measure a bunch of triangles to “prove” or “disprove” this. If we take a measurement that violates the Triangle Inequality, it can only mean that we have a broken ruler, poor measurement skills, or a busted “triangle” with crooked sides.
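To make that concrete, here’s a hypothetical sketch (not from the original) in Python: build triangles from arbitrary points and check the inequality. The exercise is pointless as “evidence” — which is exactly the point — because the inequality holds by the mathematics of distance, not by survey:

```python
import math
import random

def side_lengths(p, q, r):
    """Euclidean side lengths of the triangle with vertices p, q, r."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    return dist(p, q), dist(q, r), dist(r, p)

# No measuring campaign required: for any three points whatsoever,
# each side is at most the sum of the other two.
random.seed(0)
for _ in range(1000):
    pts = [(random.uniform(-1e6, 1e6), random.uniform(-1e6, 1e6))
           for _ in range(3)]
    a, b, c = side_lengths(*pts)
    assert a <= b + c and b <= a + c and c <= a + b
```

If an assertion here ever fired, we would suspect the “ruler” (floating-point rounding on a nearly degenerate triangle), not the theorem.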

And this brings me to the subject of safe and unsafe programming languages, which I can’t stop writing about (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13).

Many people don’t realize that the word “safe” is not just wishful
thinking. “Safe” is a technical term with a *mathematical* meaning.
The behavior of these languages can be written down mathematically and
it can be *proved* that their behavior **does not include memory
errors**.
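For a concrete (if simplified) illustration of what the guarantee buys you, consider Python, itself a safe language. An out-of-bounds access has a *defined* outcome — an exception you can observe and handle — never a read of arbitrary memory. This is a sketch of mine, not the formal definition:

```python
def read_at(xs, i):
    """Read xs[i]; an out-of-range index yields None instead of garbage.

    In a safe language, every access has a defined result: either the
    element, or a well-specified error. There is no execution in which
    the program reads memory outside the list.
    """
    try:
        return xs[i]
    except IndexError:
        return None

assert read_at([10, 20, 30], 1) == 20
assert read_at([10, 20, 30], 99) is None  # defined and catchable -- not a memory error
```

The same access in an unsafe language would be undefined behavior: the spec permits *anything*, which is precisely what cannot be proved about.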

Some people won’t believe this until they see the data. They want a comprehensive review of all the literature, vulnerability reports, etc., something which would require an immense effort. If the point is to show that safe languages do not have memory errors, that would be a waste. Math has already told us that.

This is not to say that data is useless. As with the Triangle
Inequality, we may find a data point that *seems* to show that a safe
language has a memory error—but math tells us that this would
inevitably show a completely different problem. Namely, the safe
language compiles to an unsafe language, or the safe language is
interpreted by a program written in an unsafe language, or the safe
language is calling a library written in an unsafe language. In other
words, the data can help us find bugs in safe language
implementations.
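The third case — a safe language calling an unsafe library — is easy to see from Python. This is a minimal sketch of mine (it assumes a Unix-like system where the C library can be located and loaded):

```python
import ctypes
import ctypes.util

# The calling Python code is safe, but strlen itself is C. If we handed it
# a buffer with no terminating NUL, any resulting memory error would occur
# inside the unsafe library, outside the safe language's guarantee.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

assert libc.strlen(b"safe?") == 5
```

When a “memory error in Python” shows up in the data, the boundary crossed by a call like this is where to look.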

If safe language *implementations* can have memory errors, does this
make their mathematical guarantee worthless? No, for two reasons.

First, fixing a bug in a safe language implementation makes *all
programs written in that language safer*. This is the same reason it
makes sense to use a well-tested and widely-used library for a
critical task rather than roll your own code.

Second, remember that we have been working on safe languages for more
than 40 years, and we know a *lot* about how to make their
implementations safe. In particular, we know how to create safe
*machine languages*. That is, we can compile a safe high-level
language into a safe machine language that can run directly on the
hardware. Turtles all the way down, as they say. Or better yet:
*math* all the way down.