SemVer Is Not About You

A popular genre of article for the past few years has been the SemVer critique: pointing out various things that are wrong with SemVer itself, or with the way SemVer is being applied, and, customarily, suggesting an alternative versioning scheme. Usually, the focus is either on how SemVer ought to be used by library authors (nitpicking the definition of a breaking change), or on how SemVer is (not) useful for a library consumer (nitpicking the definition of a breaking change).

I think these are valid lenses to study SemVer through, but not the most useful ones. This article suggests an alternative framing: SemVer is not about you.

Before we begin, I would like to carefully delineate the scope. Among other things, SemVer can be applied to, broadly speaking, applications and libraries. Applications are stand-alone software artifacts, usable as is. Libraries exist within the larger library ecosystem, and are building blocks for assembling applications. Libraries both depend on and are depended upon by other libraries. In the present article, we will look only at the library side.


At first glance, it appears that SemVer solves the problem of informing the user when to upgrade: upgrade patch for the latest bugfixes, upgrade minor if you want new features, upgrade major if you want new features and are ready to clean up your code. But this is not the primary value of this versioning scheme. The real purpose of SemVer is managing transitive dependencies.

Let's say you are using some version of the apples library and some version of the oranges library, and suppose they both depend on the trees library. Because apples and oranges were authored at different times, they do not necessarily depend on the same version of trees. There are two paths from here.

The first is to include two different versions of the trees library with your app. This is unfortunate for the trivial reason of code bloat, and for the more subtle reason of interface leaking: if for some reason your code needs to pass a tree originating in apples over to oranges, both must use exactly the same version of trees.
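The interface-leaking problem can be sketched in a few lines. This is only an illustration: `load_trees`, `apples_grow` and `oranges_accepts` are hypothetical stand-ins, and the version strings are made up. Each call to `load_trees` simulates a separately bundled copy of the trees library.

```python
def load_trees(version: str) -> type:
    """Build a fresh Tree class, simulating a separately bundled copy of trees."""
    class Tree:
        lib_version = version
    return Tree


TreeV1 = load_trees("1.4.0")  # the copy apples was built against (made-up version)
TreeV2 = load_trees("2.0.0")  # the copy oranges was built against (made-up version)


def apples_grow():
    """apples hands out a tree from its own copy of the library."""
    return TreeV1()


def oranges_accepts(tree) -> bool:
    """oranges only recognises trees from its own copy of the library."""
    return isinstance(tree, TreeV2)


tree = apples_grow()
print(oranges_accepts(tree))  # False: same name, different type
```

Both classes are called `Tree`, but they are distinct types, so a value from one copy is not accepted by code written against the other.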

The second path is to somehow unify transitive dependencies, and pick a single version of trees that's good for both apples and oranges. But perhaps there isn't a version that works for both?

Who's the right person to choose the appropriate course of action? It could be you, but that's unfortunate: you are using libraries precisely because you want to avoid thinking too much about their internals. You don't know how apples is using trees. You could learn that, but, arguably, that's not a good tradeoff (if it is, perhaps you shouldn't depend on apples and should instead maintain your own). What's worse, for feature-rich applications dependency trees run very deep, the potential for conflicts scales at least linearly, and there's only a single you.

Another candidate is the author of the trees library. They don't know apples and oranges directly, but they should be thinking about how their library could be used. And, because different libraries tend to have different authors, the work of resolving version conflicts gets distributed across a set of people that also scales linearly!

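This is the problem that SemVer solves. It has nothing to do with your code or your direct dependencies; it's all about the dependencies of your dependencies. SemVer is the library maintainer saying when two versions of their library can be unified: versions that share a major number can be (the resolver picks the newer one), and versions with different major numbers cannot.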
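As a rough sketch of how a resolver might apply that statement, assuming plain `major.minor.patch` versions and ignoring pre-release tags and the special handling some tools give to `0.x` releases (`parse` and `unify` are hypothetical helper names, not any real resolver's API):

```python
from typing import Optional, Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


def parse(text: str) -> Version:
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def unify(a: Version, b: Version) -> Optional[Version]:
    """Return one version that can stand in for both, or None if none can."""
    if a[0] != b[0]:
        return None   # different majors: cannot be unified
    return max(a, b)  # same major: the newer version wins


print(unify(parse("1.2.0"), parse("1.4.3")))  # (1, 4, 3)
print(unify(parse("1.9.0"), parse("2.0.0")))  # None
```

Note that the function only ever distinguishes "same major" from "different major"; minor and patch merely decide which version is newer.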

That's it! That's the whole thing! All the talk about breaking changes is downstream of this actual behavior of version resolvers.


Notably, if you are a library maintainer, SemVer isn't about you either. When deciding between major and minor, you shouldn't be thinking about your direct dependents. They knowingly use your library, so they are capable of making informed decisions and will manage just fine. The problem is your transitive dependents. If you release a new major version, the dependencies of some application up the stack could get wedged if somewhere in its dependency tree there are both versions of your library which need interoperable types.

Or, rather, if you release a new major version, it is guaranteed that some application will have two copies of your library. There's no such thing as an atomic upgrade of dependencies across the ecosystem: propagating your new major will take time, and there will be an extended period where both majors are used by different libraries, and both majors end up in applications' lockfiles. The question is rather: would this be more harmful than just code bloat? If your library ends up in others' public APIs, you will likely lock some applications up the stack into a variant of the problem described earlier: two dependencies that need to exchange your types, each built against a different major.
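To make this concrete, here is a hypothetical dependency graph in which one library has migrated to the new major and the other has not. The edge list and the `duplicated_majors` helper are illustrative assumptions, not the format of any real lockfile.

```python
from collections import defaultdict

# Hypothetical resolved graph: (dependent, dependency, major version used).
edges = [
    ("app", "apples", 1),
    ("app", "oranges", 1),
    ("apples", "trees", 1),   # apples has not migrated yet
    ("oranges", "trees", 2),  # oranges already moved to the new major
]


def duplicated_majors(edges):
    """Return the dependencies that appear with more than one major version."""
    majors = defaultdict(set)
    for _dependent, dependency, major in edges:
        majors[dependency].add(major)
    return {name: sorted(found) for name, found in majors.items() if len(found) > 1}


print(duplicated_majors(edges))  # {'trees': [1, 2]}
```

If apples and oranges both expose trees types in their public APIs, the app cannot pass a tree from one to the other until both sit on the same major.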

It's also worth thinking about the virality of major versions: if your library is part of someone else's public API, your major bump implies their major bump. This is of course bad because putting work on the plate of other maintainers is bad, but, what's worse, it virally amplifies the number of unsatisfiable dependency graphs, à la the situation above.

SemVer--

I've seen two interesting extensions to the core SemVer. One is the observation that, to make tooling work, only two version numbers are sufficient. There's no real difference between patch and minor, as far as the actual behavior of the version resolution algorithm goes. I am sympathetic to this argument!

The second one is the observation that many projects follow the deprecate, then remove cycle. I learned this with the release of Ember 2.0. The big deal about Ember 2.0 is that the only thing it did was remove the APIs that had been emitting deprecation warnings. Code that didn't emit warnings on the latest Ember 1.x was compatible with 2.0.

This feels like the fundamentally right way of going about the larger, more important building blocks. And you sort of can do this with SemVer today, if you declare that you are compatible with "1.9, 2.0". But, even today, many years after Ember 2.0, this still feels like a cute trick. This isn't yet a pattern with a catchy name (like release trains or the not rocket science rule) that everyone is using because it is an obviously good idea.
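One way to read such a declaration is as a single range that spans two majors. The sketch below assumes the range "at least 1.9, below 3.0"; the exact requirement syntax differs between package managers, and `satisfies` is a hypothetical helper rather than a real API.

```python
def satisfies(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing numeric components."""
    def key(text: str):
        return tuple(int(part) for part in text.split("."))
    return key(lower) <= key(version) < key(upper)


# Assumed range for a library that emits no deprecation warnings on 1.9.
LOWER, UPPER = "1.9", "3.0"

for candidate in ("1.8.5", "1.9.3", "2.0.1", "3.0.0"):
    print(candidate, satisfies(candidate, LOWER, UPPER))
# 1.8.5 False
# 1.9.3 True
# 2.0.1 True
# 3.0.0 False
```

A dependent declared this way can be unified with whichever major the rest of the graph has settled on, which is what lets the ecosystem migrate gradually.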

And Now To Something Completely Different

Circling back to the introduction, the general pattern here is that there's a prescriptivist approach and a descriptivist one. The prescriptivist argues about the right and wrong ways to use a particular tool. The descriptivist avoids value judgement, and describes how the thing actually behaves.

Another instance of this pattern that I've noticed is log levels. You can get very philosophical about the difference between error, warn and info. But what helps is looking at what they do:

  • error pages the operator immediately.
  • warn pages if it repeats frequently.
  • info is what you see in the prod logs when you actively look at them.
  • And debug is what your developers see when they enable extra logging.
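The list above can be sketched as a small routing rule. `Pager`, its threshold and its time window are hypothetical choices made for illustration; "repeats frequently" is interpreted here as "at least three warnings within sixty seconds".

```python
import time
from collections import deque


class Pager:
    """Hypothetical alert router: decides whether a log record pages the operator."""

    def __init__(self, warn_threshold=3, window_seconds=60.0):
        self.warn_threshold = warn_threshold
        self.window_seconds = window_seconds
        self._recent_warns = deque()

    def should_page(self, level, now=None):
        now = time.monotonic() if now is None else now
        if level == "error":
            return True  # error pages immediately
        if level == "warn":
            self._recent_warns.append(now)
            # Drop warnings that fell out of the sliding window.
            while self._recent_warns and now - self._recent_warns[0] > self.window_seconds:
                self._recent_warns.popleft()
            return len(self._recent_warns) >= self.warn_threshold
        return False  # info and debug never page; they are only read on demand


pager = Pager()
print(pager.should_page("error", now=0.0))                          # True
print([pager.should_page("warn", now=t) for t in (1.0, 2.0, 3.0)])  # [False, False, True]
print(pager.should_page("info", now=4.0))                           # False
```

The levels are defined entirely by what happens to a record, not by how serious the message sounds.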

let's wear our clothes the wrong way
let's stress our verbs the wrong way
and as for the prescriptivists,
we couldn't care less

avva