Code Smell: Concrete Abstraction
This is a hand-wavy philosophical article about programming, without quantifiable justification, but with some actionable advice and a case study.
Suppose that there are two types in the program.
Suppose also that they both can `blag`.
Does it make sense to add the following trait?
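A minimal sketch of the kind of trait in question. Only the names `Gonk` and `blag` come from the text; the trait name `Blag`, the type name `Frob`, the fields, and the `&mut self` receiver are assumptions made for illustration.

```rust
// Hypothetical stand-ins for the two types in the program.
struct Frob {
    count: u32,
}

struct Gonk {
    log: Vec<String>,
}

// The candidate abstraction: one trait covering both types.
trait Blag {
    fn blag(&mut self);
}

impl Blag for Frob {
    fn blag(&mut self) {
        self.count += 1;
    }
}

impl Blag for Gonk {
    fn blag(&mut self) {
        self.log.push("blag".to_string());
    }
}
```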
I claim that it makes sense only if you have a function like
That is, if some part of your program is generic over a type `T` bounded by this trait.
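A generic consumer of this kind might look like the sketch below. The function name `blag_twice`, the trait name `Blag`, and the `blags` field are hypothetical. The point is only that the body knows nothing about its argument beyond the trait bound.

```rust
trait Blag {
    fn blag(&mut self);
}

struct Gonk {
    blags: u32,
}

impl Blag for Gonk {
    fn blag(&mut self) {
        self.blags += 1;
    }
}

// Genuinely generic code: `x` is a `T`, not a concrete type.
// This is the kind of function that justifies having the trait.
fn blag_twice<T: Blag>(x: &mut T) {
    x.blag();
    x.blag();
}
```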
If, in every call to `blag`, the receiver `x` is one of the two concrete types (such as `Gonk`), but never a `T` (each usage is concrete), you don’t need this abstraction.
“Need” is used in a literal sense here: replace the trait with two inherent methods named `blag`, and the code will be essentially the same.
Using a trait here doesn’t achieve any semantic compression.
Given that abstractions have costs, “don’t need” can be strengthened to “probably shouldn’t”.
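The trait-free version can be sketched as follows, again with hypothetical types and fields (`Frob`, `count`, `log`, and the `run` function are assumptions). Call sites look exactly the same as they would with a trait, because every receiver is a concrete type.

```rust
struct Frob {
    count: u32,
}

struct Gonk {
    log: Vec<String>,
}

// Two inherent methods that merely share a name; no trait involved.
impl Frob {
    fn blag(&mut self) {
        self.count += 1;
    }
}

impl Gonk {
    fn blag(&mut self) {
        self.log.push("blag".to_string());
    }
}

// A concrete caller: each `x.blag()` resolves against a known type.
fn run(frob: &mut Frob, gonk: &mut Gonk) {
    frob.blag();
    gonk.blag();
}
```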
Not going for an abstraction often allows for a more specific interface.
A monad in Haskell is a thing with `>>=`, which doesn’t tell you much.
Languages like Rust and OCaml can’t express a general monad, but they still have concrete monads.
`>>=` is called `and_then` for futures and `flat_map` for lists.
These names are more specific than `>>=` and are easier to understand.
`>>=` is only required if you want to write code generic over the type of monad itself, which happens rarely.
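As a small illustration of concrete, specifically named bind operations, the sketch below uses `Option::and_then` and `Iterator::flat_map` from the standard library. `Option` is used in place of futures only to keep the example free of external crates. The helper names `half_of_even` and `expand` are made up.

```rust
// Chain two steps that may each fail: parse, then halve if even.
fn half_of_even(s: &str) -> Option<i32> {
    s.parse::<i32>()
        .ok()
        .and_then(|n| if n % 2 == 0 { Some(n / 2) } else { None })
}

// Expand every number `n` into the range `0..n` and flatten the result.
fn expand(xs: &[u32]) -> Vec<u32> {
    xs.iter().flat_map(|&n| 0..n).collect()
}
```

Neither function needs a shared `Monad` abstraction; each uses the name that fits its own type.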
Another example of an abstraction which is used mostly concretely is the collection hierarchy.
In Java or Scala, there’s a whole type hierarchy for things which can hold other things.
Rust’s type system can’t express a `Collection` trait, so we have to get by with using concrete collection types.
And it isn’t actually a problem in practice.
Turns out, writing code which is generic over collections (and not just over iterators) is not that useful.
The “but I can change the collection type later” argument also seems overrated: often, there’s only a single collection type that makes sense.
And switching to `BTreeSet` from a comparable set type is mostly just a change at the definition site, as the two happen to have almost identical interfaces anyway.
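A sketch of what “a change at the definition site” can look like. It assumes the other set type is `HashSet` (the text does not say which), and the alias `Names` and both helper functions are hypothetical.

```rust
use std::collections::HashSet;

// Swapping this alias to `std::collections::BTreeSet<String>` is the only
// edit needed: `new`, `insert`, `contains`, and `len` exist on both types.
type Names = HashSet<String>;

fn collect_names(words: &[&str]) -> Names {
    let mut names = Names::new();
    for w in words {
        names.insert(w.to_string());
    }
    names
}

fn has_name(names: &Names, name: &str) -> bool {
    names.contains(name)
}
```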
The only case where I miss Java collections is when I return a `Vec<T>`, but mean a generic unordered collection.
In Java, the difference is captured by the interface type you declare.
In Rust, there’s nothing built-in for this.
It is possible to define a `VecSet<T>(Vec<T>)`, but it doesn’t seem worth the effort.
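For reference, such a wrapper could be sketched roughly as below. Only the name `VecSet<T>(Vec<T>)` comes from the text; the method set is an assumption. The sketch deliberately exposes no indexing, so callers cannot come to depend on element order. It does not deduplicate.

```rust
/// Hypothetical wrapper: a `Vec` whose order callers should not rely on.
pub struct VecSet<T>(Vec<T>);

impl<T: PartialEq> VecSet<T> {
    pub fn from_vec(items: Vec<T>) -> Self {
        VecSet(items)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }

    pub fn contains(&self, item: &T) -> bool {
        self.0.contains(item)
    }

    /// Iterates in an unspecified order (today: insertion order).
    pub fn iter(&self) -> std::slice::Iter<'_, T> {
        self.0.iter()
    }
}
```

Even this small version needs a handful of forwarding methods, which is the effort the text refers to.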
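Collections also suffer from the `>>=` problem: collapsing similar synonyms under a single name.
A queue type ends up with generic collection methods alongside its own `poll`-style methods, because it needs to be a collection, but is also a special kind of collection.
In C++, you have to spell `vector`’s push operation as `push_back`, so that it duck-types with containers that can also push at the front.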
Finally, the promised case study!
rust-analyzer needs to convert a bunch of internal types to types suitable for serializing into JSON messages of the Language Server Protocol.
For example, `ra::Completion` is converted into its LSP counterpart, and `ra::TextRange` is converted to an LSP range.
The first implementation started with an abstraction for conversion:
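The original trait definition is not reproduced here. A minimal sketch of what such a conversion abstraction might look like, assuming a trait named `Conv` with an associated output type, is shown below. The `.conv` method name appears in the text; the trait name and the severity enums are hypothetical stand-ins.

```rust
// Hypothetical conversion abstraction: one method, one associated type.
trait Conv {
    type Output;
    fn conv(self) -> Self::Output;
}

// Made-up internal and protocol-side types, for illustration only.
#[derive(Debug, PartialEq)]
enum RaSeverity {
    Error,
    WeakWarning,
}

#[derive(Debug, PartialEq)]
enum LspSeverity {
    Error,
    Hint,
}

impl Conv for RaSeverity {
    type Output = LspSeverity;

    fn conv(self) -> LspSeverity {
        match self {
            RaSeverity::Error => LspSeverity::Error,
            RaSeverity::WeakWarning => LspSeverity::Hint,
        }
    }
}
```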
This abstraction doesn’t work for all cases — sometimes the conversion requires additional context.
For example, to convert a rust-analyzer offset (the position of a byte in the file) to an LSP position (a `(line, column)` pair), a table with the positions of newlines is needed.
This is easy to handle:
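One way context might be threaded through is a second trait that takes an extra parameter. The sketch below assumes names such as `ConvWith`, `conv_with`, `LineIndex`, `Offset`, and `Position`; only `CTX` is taken from the text. For simplicity it counts columns in bytes, which is not necessarily what the protocol expects.

```rust
/// Hypothetical newline table: byte offset at which each line starts.
struct LineIndex {
    line_starts: Vec<usize>,
}

impl LineIndex {
    fn new(text: &str) -> LineIndex {
        let mut line_starts = vec![0];
        for (i, b) in text.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts }
    }
}

#[derive(Debug, PartialEq)]
struct Position {
    line: usize,
    column: usize,
}

// Conversion that needs additional context of type `CTX`.
trait ConvWith<CTX> {
    type Output;
    fn conv_with(self, ctx: CTX) -> Self::Output;
}

struct Offset(usize);

impl<'a> ConvWith<&'a LineIndex> for Offset {
    type Output = Position;

    fn conv_with(self, index: &'a LineIndex) -> Position {
        // Index of the last line whose start is <= the offset.
        // `line_starts[0]` is 0, so the subtraction cannot underflow.
        let line = index.line_starts.partition_point(|&start| start <= self.0) - 1;
        Position { line, column: self.0 - index.line_starts[line] }
    }
}
```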
Naturally, there was an intricate web of delegating impls. The typical one looked like this:
There were a couple of genuinely generic impls for converting iterators of convertible things.
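To make the two kinds of impl concrete, here is a sketch with made-up types (`RaOffset`, `RaRange`, `ProtoOffset`, `ProtoRange`) and a hypothetical `Conv` trait. The first two impls are concrete, one delegating to the other. The last one is generic, written over `Vec<T>` rather than over iterators to keep it short.

```rust
trait Conv {
    type Output;
    fn conv(self) -> Self::Output;
}

struct RaOffset(u32);

#[derive(Debug, PartialEq)]
struct ProtoOffset(u64);

impl Conv for RaOffset {
    type Output = ProtoOffset;
    fn conv(self) -> ProtoOffset {
        ProtoOffset(u64::from(self.0))
    }
}

struct RaRange {
    start: RaOffset,
    end: RaOffset,
}

#[derive(Debug, PartialEq)]
struct ProtoRange {
    start: ProtoOffset,
    end: ProtoOffset,
}

// A delegating impl: converts a compound type field by field.
impl Conv for RaRange {
    type Output = ProtoRange;
    fn conv(self) -> ProtoRange {
        ProtoRange { start: self.start.conv(), end: self.end.conv() }
    }
}

// A genuinely generic impl: works for any convertible element type.
impl<T: Conv> Conv for Vec<T> {
    type Output = Vec<T::Output>;
    fn conv(self) -> Vec<T::Output> {
        self.into_iter().map(T::conv).collect()
    }
}
```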
The code was hard to understand.
It was also hard to use: if calling `.conv` didn’t work immediately, it took a lot of time to find which specific impl didn’t apply.
Finally, there were many accidental (as in “accidental complexity”) changes to the shape of the code: `CTX` being passed by value or by reference, switching between generic parameters and associated types, etc.
I was really annoyed by how this conceptually simple, pure-boilerplate operation got expressed as a clever and fancy abstraction. Crucially, almost all of the usages of the abstraction (besides those couple of iterator impls) were concrete. So I replaced the whole edifice with much simpler code, a bunch of functions:
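The function-based shape can be sketched as a plain module. The path `to_proto::range` is mentioned in the text; the signatures, the `position` helper, and all the types below are assumptions made so the example stands on its own.

```rust
// Hypothetical internal and protocol-side types.
pub struct LineIndex {
    pub line_starts: Vec<u32>,
}

pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

#[derive(Debug, PartialEq)]
pub struct Position {
    pub line: u32,
    pub character: u32,
}

#[derive(Debug, PartialEq)]
pub struct Range {
    pub start: Position,
    pub end: Position,
}

mod to_proto {
    use super::{LineIndex, Position, Range, TextRange};

    // Plain function: the required context is an ordinary argument.
    pub fn position(index: &LineIndex, offset: u32) -> Position {
        // Last line starting at or before `offset`; assumes `line_starts[0] == 0`.
        let line = index.line_starts.partition_point(|&s| s <= offset) - 1;
        Position {
            line: line as u32,
            character: offset - index.line_starts[line],
        }
    }

    // Compound conversions just call the simpler functions directly.
    pub fn range(index: &LineIndex, range: TextRange) -> Range {
        Range {
            start: position(index, range.start),
            end: position(index, range.end),
        }
    }
}
```

A type mismatch at a call such as `to_proto::range(&index, r)` is reported against one concrete signature, rather than against a set of candidate impls.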
Simplicity and ease of use went up tremendously.
Now, instead of typing `x.conv()` and trying to figure out why an impl I think should apply doesn’t apply, I just auto-complete `to_proto::range` and let the compiler tell me exactly which types don’t line up.
I’ve lost the fancy iterator impls, but, going by the diff for the commit, the code did not grow as a result.
There was some genuine code re-use in those impls, but it was not justified by the overall compression, even disregarding additional complexity tax.
To sum up, “is this abstraction used exclusively concretely?” is a meaningful question about the overall shape of code. If the answer is “Yes!”, then the abstraction can be replaced by a number of equivalent non-abstract implementations. As the latter tend to be simpler, shorter, and more direct, “Concrete Abstraction” can be considered a code smell. As usual though, any abstract programming advice can be applied only in a concrete context — don’t blindly replace abstractions with concretions, check if provided justifications work for your particular case!
Discussion on /r/rust.