Delete Cargo Integration Tests

Click bait title! We'll actually look into how integration and unit tests are implemented in Cargo. A few guidelines for organizing test suites in large Cargo projects naturally arise out of these implementation differences. And, yes, one of those guidelines will turn out to be: delete all integration tests but one.

Keep in mind that this post is explicitly only about Cargo concepts. It doesn't discuss the relative merits of integration or unit styles of testing. I'd love to, but that's going to be a loooong article some other day!

Loomings 🐳

When you use Cargo, you can put #[test] functions directly next to code, in files inside the src/ directory. Alternatively, you can put them into dedicated files inside tests/:

awesomeness-rs/
  Cargo.toml
  src/          # unit tests go here
    lib.rs
    submodule.rs
    submodule/
      tests.rs

  tests/        # integration tests go here
    is_awesome.rs

I stress that unit/integration terminology is based purely on the location of the #[test] functions, and not on what those functions actually do.

To build unit tests, Cargo runs

rustc --test src/lib.rs

Rustc then compiles the library with --cfg test. It also injects a generated fn main(), which invokes all functions annotated with #[test]. The result is an executable file which, when run subsequently by Cargo, executes the tests.
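To make the "generated fn main()" idea concrete, here is a hand-written stand-in for such a harness. It is a conceptual sketch only: the harness rustc really generates is more elaborate, and every name below (run_tests, addition_works, subtraction_is_broken) is made up for illustration.

```rust
use std::panic;

// Stand-ins for functions that would carry `#[test]` in a real crate.
fn addition_works() {
    assert_eq!(2 + 2, 4);
}

fn subtraction_is_broken() {
    assert_eq!(2 - 2, 1, "deliberately wrong");
}

/// Runs each test, treats a panic as a failure, and returns the failure count.
fn run_tests(tests: &[(&str, fn())]) -> usize {
    let mut failed = 0;
    for &(name, test) in tests {
        // A panicking test must not take down the whole run.
        match panic::catch_unwind(test) {
            Ok(()) => println!("test {name} ... ok"),
            Err(_) => {
                println!("test {name} ... FAILED");
                failed += 1;
            }
        }
    }
    failed
}

fn main() {
    // The real harness collects `#[test]` functions automatically;
    // here the list is written out by hand.
    let tests: [(&str, fn()); 2] = [
        ("addition_works", addition_works as fn()),
        ("subtraction_is_broken", subtraction_is_broken as fn()),
    ];
    let failed = run_tests(&tests);
    // Report failure to the caller through a nonzero exit code.
    std::process::exit(if failed == 0 { 0 } else { 1 });
}
```

The point is only that the test binary is an ordinary executable whose main walks a list of test functions.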

Integration tests are built differently. First, Cargo uses rustc to compile the library as usual, without --cfg test:

rustc --crate-type=rlib src/lib.rs

This produces an .rlib file, a compiled library.

Then, for each file in the tests directory, Cargo runs the equivalent of

rustc --test --extern awesomeness=path/to/awesomeness.rlib \
    ./tests/is_awesome.rs

That is, each integration test is compiled into a separate binary. Running those binaries executes the test functions.

Implications

Note that rustc needs to repeatedly re-link the library crate with each of the integration tests. This can add up to a significant compilation time blowup for tests. That is why I recommend that large projects have only one integration test crate with several modules. That is, don't do this:

tests/
  foo.rs
  bar.rs

Do this instead:

tests/
  integration/
    main.rs
    foo.rs
    bar.rs

When a refactoring along these lines was applied to Cargo itself, the effects were substantial (numbers). The time to compile the test suite decreased 3x. The size of on-disk artifacts decreased 5x.

It can't get better than this, right? Wrong! Rust tests are run in parallel by default. The main that is generated by rustc spawns several threads to saturate all of the CPU cores. However, Cargo itself runs test binaries sequentially. This makes sense: otherwise, concurrently executing test binaries would oversubscribe the CPU. But this means that multiple integration tests leave performance on the table. The critical path is the sum of the longest tests in each binary. The more binaries, the longer the path. For one of my projects, consolidating several integration tests into one reduced the time to run the test suite from 20 seconds to just 13.
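The critical-path arithmetic can be sketched as a toy model. It assumes there are enough cores to run every test of a binary at once, so each binary costs as much as its slowest test. The durations are invented for illustration and are not measurements from any project.

```rust
/// Estimated wall-clock time in seconds. Binaries run one after another;
/// tests inside a binary are assumed to run fully in parallel, so each
/// binary contributes the duration of its slowest test.
fn estimated_wall_clock(binaries: &[Vec<u32>]) -> u32 {
    binaries
        .iter()
        .map(|tests| tests.iter().copied().max().unwrap_or(0))
        .sum()
}

fn main() {
    // Hypothetical per-test durations, one inner Vec per test binary.
    let separate = vec![vec![5, 1, 1], vec![4, 2], vec![3, 3, 1]];
    // The same tests consolidated into a single binary.
    let merged = vec![separate.concat()];

    // 5 + 4 + 3 = 12
    println!("separate binaries: {}s", estimated_wall_clock(&separate));
    // max of all tests = 5
    println!("single binary:     {}s", estimated_wall_clock(&merged));
}
```

With fewer cores than tests the single binary would take longer than its slowest test, but it still never waits for one binary to finish before starting the next.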

A nice side-effect of a single modularized integration test is that sharing code between separate tests becomes trivial: you just pull it into a submodule. There's no need to awkwardly repeat mod common; for each integration test.
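As a rough sketch of what such a tests/integration/main.rs could look like, here is a single-file version. The modules are inlined only to keep the example self-contained, and the helper sample_words is a hypothetical fixture.

```rust
// tests/integration/main.rs (sketch)
//
// In a real project this file would contain only
//     mod common;
//     mod foo;
//     mod bar;
// with the bodies living in sibling files.

mod common {
    /// Hypothetical fixture shared by all test modules.
    pub fn sample_words() -> Vec<&'static str> {
        vec!["fast", "tests", "rock"]
    }
}

mod foo {
    use super::common::sample_words;

    #[test]
    fn has_three_words() {
        assert_eq!(sample_words().len(), 3);
    }
}

mod bar {
    use super::common::sample_words;

    #[test]
    fn first_word_is_fast() {
        assert_eq!(sample_words()[0], "fast");
    }
}
```

Because foo and bar are modules of the same crate, common is declared exactly once and linked exactly once.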

Rules of Thumb

If the project I am working with is small, I don't worry about test organization. There's no need to make tests twice as fast if they are already nearly instant.

Conversely, if the project is large (a workspace with many crates) I worry about test organization a lot. Slow tests are a boiling frog kind of problem. If you do not proactively fix it, everything is fine up until the moment you realize you need to sink a week to untangle the mess.

For a library with a public API which is published to crates.io, I avoid unit tests. Instead, I use a single integration test, called it (short for integration test):

tests/
  it.rs

# Or, for larger crates

tests/
  it/
    main.rs
    foo.rs
    bar.rs

Integration tests use the library as an external crate. This forces the usage of the same public API that consumers use, resulting in better design feedback.

For an internal library, I avoid integration tests altogether. Instead, I use Cargo unit tests for integration bits:

src/
  lib.rs
  tests.rs
  tests/
    foo.rs
    bar.rs

That way, I avoid linking a separate integration test binary altogether. I also have access to the non-pub API of the crate, which is often useful.
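A minimal sketch of the non-pub access point follows. The names checksum and is_awesome are hypothetical, and the test module is inlined only so the example fits in one file.

```rust
// src/lib.rs (sketch)

// Private helper: invisible to files under tests/, visible to unit tests.
fn checksum(input: &str) -> u32 {
    input.bytes().map(u32::from).sum()
}

pub fn is_awesome(input: &str) -> bool {
    checksum(input) % 2 == 0
}

// In a real crate this would be `mod tests;` with the body in src/tests.rs
// and further submodules under src/tests/.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn checksum_adds_bytes() {
        // b'a' = 97, b'b' = 98
        assert_eq!(checksum("ab"), 97 + 98);
    }

    #[test]
    fn public_api_goes_through_the_helper() {
        // 195 is odd, so "ab" is not awesome.
        assert!(!is_awesome("ab"));
    }
}
```

An integration test under tests/ could call is_awesome but not checksum, since it sees the crate only through its public API.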

Assorted Tricks

First, documentation tests are extremely slow. Each doc test is linked as a separate binary. For this reason, avoid doc tests in internal libraries for big projects and add this to Cargo.toml:

[lib]
doctest = false

Second, prefer

#[cfg(test)]
mod tests; // tests in `tests.rs` file

to

#[cfg(test)]
mod tests {
    // tests here
}

This way, when you modify just the tests, Cargo is smart enough not to recompile the library crate. It knows that the contents of tests.rs only affect compilation when --test is passed to rustc. Learned this one from @petrochenkov, thanks!

Third, even if you stick to unit tests, the library is compiled twice: once with --test, and once without. For this reason, folks from Pernosco go even further. They add

[lib]
test = false

to Cargo.toml, make all APIs they want to unit test public and have a single test crate for the whole workspace. This crate links everything and contains all the unit tests.

Discussion on /r/rust.