RPATH, or why lld doesn’t work on NixOS
I’ve learned a thing I wish I didn’t know. As revenge, I am going to write it down so that you, my dear reader, also learn about it. You probably want to skip this post unless you are interested and somewhat experienced in all of Rust, NixOS, and dynamic linking.
Problem
I use NixOS and Rust. For linking my Rust code, I would love to use lld, the LLVM linker, as it is significantly faster. Unfortunately, this often leads to errors when trying to run the resulting binary:
Let’s see what’s going on here!
Baseline
We’ll be using evdev-rs as a running example.
It is a binding to the evdev shared library on Linux.
First, we’ll build it with the default linker, which just works (haha, nope, this is NixOS).
Let’s get the crate:
And run the example
This, of course, doesn’t just work and spits out a humongous error message, which contains one line of important information: we are missing the libevdev library.
As this is NixOS, we are not going to barbarically install it globally.
Let’s create an isolated environment instead, using nix-shell:
And activate it:
This environment gives us two things: the pkg-config binary and the evdev library.
pkg-config is a sort of half of a C package manager for UNIX: it can’t install libraries, but it helps to locate them.
Let’s ask it about libevdev:
Essentially, it resolved the library’s short name (libevdev) to the full path of the directory where the library resides:
The libevdev.so.2.3.0 file is the actual dynamic library.
The symlinks are another bit of a C package manager, which implements somewhat-semver: the libevdev.so.2 version requirement gets resolved to the libevdev.so.2.3.0 version.
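The symlink scheme is easy to reproduce in a scratch directory. The sketch below is only an illustration: the directory is a temporary one, the file is an empty stand-in rather than a real library, and the exact symlink layout inside the Nix store may differ.

```shell
# Build a throwaway directory that mimics the versioned-name layout.
dir=$(mktemp -d)

# Empty stand-in for the real shared object (not an actual library).
touch "$dir/libevdev.so.2.3.0"

# The "version requirement" names are plain symlinks to the real file.
ln -s libevdev.so.2.3.0 "$dir/libevdev.so.2"
ln -s libevdev.so.2.3.0 "$dir/libevdev.so"

# Ask where the libevdev.so.2 requirement points.
readlink "$dir/libevdev.so.2"
```

Swapping the symlink target is all it takes to "upgrade" such a library on a conventional system, which is why the scheme only counts as half of a package manager.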
Anyway, this works well enough to allow us to finally run the example
Success!
Ooook, so let’s now do what we wanted to from the beginning and configure cargo to use lld, for blazingly fast linking.
lld
The magic spell you need to put into .cargo/config is (courtesy of @lnicola):
To unpack this:
- -C sets the codegen option link-arg=-fuse-ld=lld.
- link-arg means that rustc will pass “-fuse-ld=lld” to the linker.
- Because linkers are not in the least confusing, the “linker” here is actually the whole gcc/clang. That is, rather than invoking the linker, rustc will call cc and that will then call the linker.
- So -fuse-ld (unlike -C, I think this is an atomic option, not -f use-ld) is an argument to gcc/clang, which asks it to use the lld linker.
- And note that it’s lld rather than ldd, which confusingly exists and does something completely different.
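For reference, a setting of this shape could be written out as follows. This is a sketch rather than necessarily the exact file used here: it assumes the flags live under cargo’s [build] table, and it writes the file relative to the current directory, so it is meant to be run from the project root.

```shell
# Create the per-project cargo configuration directory.
mkdir -p .cargo

# Assumption: rustflags is set in the [build] table. The two strings are
# the -C flag and its link-arg value discussed in the list above.
cat > .cargo/config <<'EOF'
[build]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
EOF
```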
Anyhow, the end result is that we switch the linker from ld (default slow GNU linker) to lld (fast LLVM linker).
And that breaks!
Building the code still works fine:
But running the binary fails:
rpath
OK, what now?
Now, let’s understand why the first example, with ld rather than lld, can’t work :-)
As a reminder, we use NixOS, so there’s no global folder à la /usr/lib where all shared libraries are stored.
Coming back to our pkg-config example, libevdev.so is well hidden behind the hash.
So we need a pkg-config binary at compile time to get from the libevdev name to the actual location.
However, as this is a dynamic library, we need it not only during compilation, but during runtime as well.
And at runtime, the loader (also known as the dynamic linker (its binary name is something like ld-linux-x86-64.so, but despite the .so suffix, it’s an executable (I kid you not, this stuff is indeed this confusing))) loads the executable together with the shared libraries it requires.
Normally, the loader looks for libraries in well-known locations, like the aforementioned /usr/lib or the directories listed in LD_LIBRARY_PATH.
So we need something which would tell the loader that libevdev lives at /nix/store/$HASH/lib.
That something is rpath (also known as RUNPATH): this is more or less LD_LIBRARY_PATH, just hard-coded into the executable.
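To make the lookup concrete, here is a rough shell model of it. It is a sketch under loud assumptions: find_lib is a made-up helper, not something the loader exposes; the order (LD_LIBRARY_PATH, then the embedded runpath, then default directories) follows the usual glibc behaviour for RUNPATH; and real details such as the ld.so cache are left out.

```shell
# find_lib NAME RUNPATH
# Print the first existing file called NAME, trying directories in this
# simplified order: LD_LIBRARY_PATH, then RUNPATH, then two default dirs.
# The body runs in a subshell (note the parentheses), so the IFS change
# does not leak into the caller.
find_lib() (
  name=$1
  runpath=$2
  # Assumption: typical default directories on a non-NixOS system.
  defaults=/lib:/usr/lib
  # Split the colon-separated lists below on ':'.
  IFS=:
  for dir in $LD_LIBRARY_PATH $runpath $defaults; do
    if [ -e "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      exit 0
    fi
  done
  # Nothing matched: roughly the point where the loader gives up.
  exit 1
)
```

Called as find_lib libevdev.so.2 followed by the store directory, it prints the full path when the runpath argument contains the right directory. With an empty runpath and nothing helpful in LD_LIBRARY_PATH, it prints nothing, which is the situation a NixOS binary without rpath ends up in.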
We can use readelf to inspect the program’s rpath.
When the binary is linked with the default linker, the result is as follows (lightly edited for clarity):
And sure, we see the path to libevdev right there!
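When the rpath is long, it helps to print one directory per line. The helper below is a small sketch: runpath_entries is a hypothetical name, it assumes the usual readelf -d line shape with the path list in square brackets, and the binary path in the usage comment is only an example.

```shell
# runpath_entries: read `readelf -d` output on stdin and print each
# RPATH/RUNPATH directory on its own line.
runpath_entries() {
  # Keep only the bracketed list from the (RPATH)/(RUNPATH) line,
  # then turn the ':' separators into newlines.
  sed -n -E 's/.*\((RPATH|RUNPATH)\).*\[(.*)\][[:space:]]*$/\2/p' | tr ':' '\n'
}

# Usage (the binary path is a hypothetical example):
#   readelf -d target/debug/examples/evtest | runpath_entries
```

Piping the result through grep for libevdev then gives a quick yes/no answer on whether the library’s directory made it into the binary.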
With rustflags = ["-Clink-arg=-fuse-ld=lld"], the result is different, and the library is missing from rpath:
At this point, I think we know what’s going on. To recap:
- With both ld and lld, we don’t have problems at compile time, because pkg-config helps the compiler to find the library.
- At runtime, the binary linked with lld fails to find the shared library, while the one linked with ld works.
- The difference between the two binaries is the value of rpath in the binary itself. ld somehow manages to include an rpath which contains the path to the library. This rpath is what allows the loader to locate the library at runtime.
Curious observation: dynamic linking on NixOS is not entirely dynamic.
Because executables expect to find shared libraries in specific locations marked with hashes of the libraries themselves, it’s not possible to just upgrade a .so on disk for all the binaries to pick it up.
Who sets rpath?
At this point, we have only one question left:
Why?
Why do we have that magical rpath thing in one of the binaries?
The answer is simple: to set rpath, one passes the -rpath /nix/store/... flag to the linker at compile time.
The linker then just embeds the specified string as the rpath field in the executable, without really inspecting it in any way.
And here comes the magical/hacky bit: the thing that adds that -rpath argument to the linker’s command line is the NixOS wrapper script!
That is, the ld on NixOS is not a proper ld, but rather a shell script which does a bit of extra fudging here and there, including the rpath:
There’s a lot going on in that wrapper script, but the relevant thing to us, as far as I understand, is that everything that gets passed as -L at compile time gets embedded into the binary’s rpath, so that it can be used at runtime as well.
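Stripped of everything else, the idea can be modelled in a few lines of shell. This is a deliberately naive sketch and not the actual wrapper: add_rpath_flags is a made-up function, it only understands the joined -LDIR form, and the real script is presumably more selective about which directories it adds.

```shell
# add_rpath_flags ARGS...
# Print the linker arguments one per line, followed by an extra
# "-rpath DIR" pair for every -LDIR argument.
# Hypothetical simplification: the separate "-L DIR" form is ignored.
add_rpath_flags() {
  # First, pass every original argument through unchanged.
  for arg in "$@"; do
    printf '%s\n' "$arg"
  done
  # Then mirror each library search directory into the rpath.
  for arg in "$@"; do
    case $arg in
      -L?*) printf '%s\n' -rpath "${arg#-L}" ;;
    esac
  done
}
```

A wrapper built along these lines would end by handing the extended argument list to the real linker, which is why the unwrapped binary and the wrapped script can produce different executables from the same command line.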
Now, let’s take a look at lld’s wrapper:
Haha, nope, there’s no wrapper!
Unlike ld, lld on NixOS is an honest-to-Bosch binary file, and that’s why we can’t have great things!
This is tracked in issue #24744 in the nixpkgs repo :)
Update:
So….. turns out there’s more than one lld on NixOS.
There’s pkgs.lld, the thing I have been using in the post.
And then there’s the pkgs.llvmPackages.bintools package, which also contains lld.
And that version is actually wrapped into an rpath-setting shell script, the same way ld is.
That is, pkgs.lld is the wrong lld, the right one is pkgs.llvmPackages.bintools.