The Fundamental Law Of Software Dependencies

Canonical source code for software should include checksums of the content of all its dependencies.

Several examples of the law:

Software obviously depends on its source code. The law says that something should hold the hash of the entire source, and thus mandates the use of a content-addressed version control system such as git.

Software often depends on third-party libraries. These libraries could in turn depend on other libraries. It is imperative to include a lockfile that covers this entire set and comes with checksums. Curiously, the lockfile itself is a part of the source code, and gets mixed into the VCS root hash.

Software needs a compiler. The hash of the required compiler should be included in the lockfile. Typically, this is not done: only the version is specified. I think that is a mistake. Specifying a version and a hash is not much more trouble than specifying just the version, but it gives you a superpower: you no longer need to trust the party that distributes your compiler. You could take a shady blob of bytes you've found lying on the street, as long as its checksum checks out.
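As a minimal sketch of what "the checksum checks out" means in practice, here is a check of a downloaded blob against a pinned hash. SHA-256 is an assumption (the text does not name a hash function), and `verify_blob` and the sample bytes are hypothetical names made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify_blob(blob: bytes, expected_sha256: str) -> bytes:
    """Return blob unchanged if its hash matches the pinned one, else raise."""
    actual = sha256_hex(blob)
    if actual != expected_sha256:
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
    return blob


# Stand-in for a compiler archive; a real one would be read from disk.
trusted_bytes = b"pretend this is a compiler archive"
# In a real project this value is written into the lockfile ahead of time,
# not computed from the download itself.
pinned = sha256_hex(trusted_bytes)

# The source of the bytes does not matter, only their hash.
compiler = verify_blob(trusted_bytes, pinned)

try:
    verify_blob(b"tampered archive", pinned)
except ValueError as err:
    print("rejected:", err)
```

The check is the same whether the bytes came from the official mirror or from a stranger, which is exactly why the distributor no longer has to be trusted.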

Note that you can compress hashes by mixing them. For the compiler use case, there's a separate hash per platform, because the Linux and the Windows versions of the compiler differ. This doesn't mean that your project should include one compiler hash per platform; one hash is enough. The compiler distribution should include a manifest: a small text file which lists all platforms and their platform-specific hashes. The single hash of that file is what is to be included by downstream consumers. To verify a specific binary, the consumer first downloads the manifest, checks that it has the correct hash, then extracts the hash for the specific platform from it and checks the binary against that.
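The two-step verification can be sketched roughly as follows. The manifest format (one `platform hash` pair per line), the platform names, the choice of SHA-256, and the function names are all assumptions made for this illustration, not a description of any real compiler distribution.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse lines of the hypothetical form '<platform> <sha256>'."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        platform, digest = line.split()
        entries[platform] = digest
    return entries


def verify_binary(
    manifest_bytes: bytes,
    pinned_manifest_hash: str,
    platform: str,
    binary: bytes,
) -> None:
    """Check the manifest against the pinned hash, then the binary against the manifest."""
    # Step 1: the manifest itself must match the single hash the project pins.
    if sha256_hex(manifest_bytes) != pinned_manifest_hash:
        raise ValueError("manifest hash mismatch")
    # Step 2: the now-trusted manifest supplies the per-platform hash.
    expected = parse_manifest(manifest_bytes.decode("utf-8")).get(platform)
    if expected is None:
        raise ValueError(f"platform {platform!r} is not listed in the manifest")
    if sha256_hex(binary) != expected:
        raise ValueError(f"binary hash mismatch for {platform!r}")


# Stand-ins for the per-platform compiler binaries.
linux_binary = b"linux compiler bytes"
windows_binary = b"windows compiler bytes"

manifest = (
    f"x86_64-linux {sha256_hex(linux_binary)}\n"
    f"x86_64-windows {sha256_hex(windows_binary)}\n"
).encode("utf-8")

# This one value is all a downstream project needs to store.
pinned = sha256_hex(manifest)

verify_binary(manifest, pinned, "x86_64-linux", linux_binary)
verify_binary(manifest, pinned, "x86_64-windows", windows_binary)
```

Because the manifest is checked first, changing any per-platform hash inside it changes the manifest's own hash, so the single pinned value covers every platform.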


The law is an instrumental goal. By themselves, hashes are not that useful. But getting to the point where you actually know the hashes requires:

These things are what actually make developing software easier.