Code Smell: Concrete Abstraction

This is a hand-wavy philosophical article about programming, without quantifiable justification, but with some actionable advice and a case study.

Suppose that there are two types in the program, Blorb and Gonk. Suppose also that they both can blag.

Does it make sense to add the following trait?

trait Blag {
    fn blag(&mut self);
}

I claim that it makes sense only if you have a function like

fn blagyify<T: Blag>(x: T) {
    ...
}

That is, if some part of your program is generic over T: Blag.

If in every x.blag() the x is either a Blorb or a Gonk, but never a T (each usage is concrete), you don't need this abstraction. "Need" is used in a literal sense here: replace the trait with two inherent methods named blag, and the code will be essentially the same. Using a trait here doesn't achieve any semantic compression.

Given that abstractions have costs, "don't need" can be strengthened to "probably shouldn't".

Not going for an abstraction often allows for a more specific interface. A monad in Haskell is a thing with >>=, which isn't saying much. Languages like Rust and OCaml can't express a general monad, but they still have concrete monads. The >>= is called and_then for futures and flat_map for lists. These names are more specific than >>= and are easier to understand. The >>= is only required if you want to write code generic over the type of the monad itself, which happens rarely.

Another example of an abstraction which is used mostly concretely is collection hierarchies. In Java or Scala, there's a whole type hierarchy for things which can hold other things. Rust's type system can't express a Collection trait, so we have to get by with using Vec, HashSet, and BTreeSet directly. And it isn't actually a problem in practice. It turns out that writing code which is generic over collections (and not just over iterators) is not that useful. The "but I can change the collection type later" argument also seems overrated: often, there's only a single collection type that makes sense. Moreover, swapping HashSet for BTreeSet is mostly just a change at the definition site, as the two happen to have almost identical interfaces anyway. The only case where I miss Java collections is when I return Vec<T>, but mean a generic unordered collection. In Java, the difference is captured by List<T> vs Collection<T>. In Rust, there's nothing built-in for this. It is possible to define a VecSet<T>(Vec<T>), but it doesn't seem worth the effort.

Collections also suffer from the >>= problem: collapsing similar synonyms under a single name. Java's Queue has add, offer, remove, and poll methods, because it needs to be a collection, but also is a special kind of collection. In C++, you have to spell push_back for a vector's push operation, so that it duck-types with a deque, which has both a front and a back.

Finally, the promised case study! rust-analyzer needs to convert a bunch of internal types to types suitable for serializing into JSON messages of the Language Server Protocol. ra::Completion is converted into lsp::Completion; ra::Completion contains ra::TextRange, which is converted to lsp::Range, etc.

The first implementation started with an abstraction for conversion:

pub trait Conv {
    type Output;
    fn conv(self) -> Self::Output;
}

This abstraction doesn't work for all cases: sometimes the conversion requires additional context. For example, to convert rust-analyzer's offset (the position of a byte in the file) to an LSP position (a (line, column) pair), a table with the positions of newlines is needed. This is easy to handle:

pub trait ConvWith<CTX> {
    type Output;
    fn conv_with(self, ctx: CTX) -> Self::Output;
}

Naturally, there was an intricate web of delegating impls. The typical one looked like this:

impl ConvWith<&LineIndex> for TextRange {
    type Output = lsp_types::Range;
    fn conv_with(
        self,
        line_index: &LineIndex,
    ) -> lsp_types::Range {
        ...
    }
}

There were a couple of genuinely generic impls for converting iterators of convertible things.
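Such a generic impl might have looked roughly like this (a reconstruction under assumptions, with a toy conversion added so the sketch is self-contained and runnable):

```rust
pub trait ConvWith<CTX> {
    type Output;
    fn conv_with(self, ctx: CTX) -> Self::Output;
}

// A blanket impl: a Vec of convertible things is itself convertible.
impl<T, C: Copy> ConvWith<C> for Vec<T>
where
    T: ConvWith<C>,
{
    type Output = Vec<T::Output>;
    fn conv_with(self, ctx: C) -> Self::Output {
        self.into_iter().map(|x| x.conv_with(ctx)).collect()
    }
}

// A toy concrete conversion, standing in for e.g. TextRange -> lsp_types::Range.
impl ConvWith<i32> for i32 {
    type Output = i32;
    fn conv_with(self, base: i32) -> i32 {
        self + base
    }
}

fn main() {
    assert_eq!(vec![1, 2, 3].conv_with(10), vec![11, 12, 13]);
}
```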

The code was hard to understand. It was also hard to use: if calling .conv didn't work immediately, it took a lot of time to find which specific impl didn't apply. Finally, there were many accidental (as in "accidental complexity") changes to the shape of the code: CTX being passed by value or by reference, switching between generic parameters and associated types, etc.

I was really annoyed by how this conceptually simple, pure-boilerplate operation got expressed as a clever and fancy abstraction. Crucially, almost all of the usages of the abstraction (besides those couple of iterator impls) were concrete. So I replaced the whole edifice with much simpler code: a bunch of functions:

fn range(
    line_index: &LineIndex,
    range: TextRange,
) -> lsp_types::Range {
    let start = position(line_index, range.start());
    let end = position(line_index, range.end());
    lsp_types::Range::new(start, end)
}

fn position(
    line_index: &LineIndex,
    offset: TextSize,
) -> lsp_types::Position {
    ...
}

Simplicity and ease of use went up tremendously. Now, instead of typing x.conv() and trying to figure out why the impl I think should apply doesn't apply, I just auto-complete to_proto::range and let the compiler tell me exactly which types don't line up.

I've lost the fancy iterator impls, but the total diff for the commit was +999,-1123. There was some genuine code re-use in those impls, but it was not justified by the overall compression, even disregarding the additional complexity tax.

To sum up, "is this abstraction used exclusively concretely?" is a meaningful question about the overall shape of code. If the answer is "Yes!", then the abstraction can be replaced by a number of equivalent non-abstract implementations. As the latter tend to be simpler, shorter, and more direct, "Concrete Abstraction" can be considered a code smell. As usual though, any abstract programming advice can be applied only in a concrete context: don't blindly replace abstractions with concretions; check whether the provided justifications hold for your particular case!

Discussion on /r/rust.