On Ousterhout’s Dichotomy

Why are there so many programming languages? One of the driving reasons for this is that some languages tend to produce fast code, but are a bit of a pain to use (C++), while others are a breeze to write, but run somewhat slow (Python). Depending on the ratio of CPUs to programmers, one or the other might be relatively more important.

But can’t we just, like, implement a universal language that is convenient but slowish by default, but allows an expert programmer to drop to a lower, more performant but harder register? I think there were many attempts at this, and they didn’t quite work out.

The natural way to go about this is to start from the high-level side. Build a high-level featureful language with large runtime, and then provide granular opt outs of specific runtime facilities. Two great examples here are C# and D. And the most famous example of this paradigm is Python, with “rewrite slow parts in C” mantra.

It seems to me that such an approach can indeed solve the “easy to use” part of the dichotomy, but doesn’t quite work as promised for “runs fast” one. And here’s the reason. For performance, what matters is not so much the code that’s executed, but rather the layout of objects in memory. And the high-level dialect locks-in pointer-heavy GC object model! Even if you write your code in assembly, the performance ceiling will be determined by all those pointers GC needs. To actually get full “low-level” performance, you need to effectively “mirror” the data across the dialects across a quasi-FFI boundary.

And that’s what kills “write most of the code in Python, rewrite hot spots in C” approach — the overhead for transitioning between the native C data structures and the Python ones tends to eat any performance benefits that C brings to the table. There are some very real, very important exceptions, where it is possible to batch sufficiently large packages of work to minimize the overhead: http://venge.net/graydon/talks/VectorizedInterpretersTalk-2023-05-12.pdf. But it seems that the average case looks more like this: https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation.

And this brings me to Rust. It feels like it accidentally blundered into the space of universal languages through the floor. There are no heavy runtime-features to opt out of in Rust. The object model is universal throughout the language. There isn’t a value-semantics/reference-semantics dichotomy, references are first-class values. And yet:

As a result, there is a certain spectrum of Rust:

While the bottom end here sits pretty comfortably next to C, the upper tip doesn’t quite reach the usability level of Python. But this is mostly compensated through these three effects: