Integration Testing Rust Binaries
Introduction
One of the things I liked about Rust when I first came to the language was the way in which testing was tightly integrated into the development process. Unit tests are colocated with the code (they're removed when compiling the final binary, of course) and the test harness is integrated into the toolchain, not a third-party tool that stands apart. By comparison, testing in C++ or Java has often felt "bolted-on". Cargo (the Rust package manager) makes a thoughtful distinction between unit and integration tests. The former has access to module-private entities (no more making things public just to expose them to the test suite) whereas the latter "sees" the code under test the way callers would.
It was only recently that I realized that the framework is really oriented toward testing Rust libraries. I'm building a web service in Rust, and "integration" testing to me involves spinning-up a datastore, starting the web service itself, and then having the integration tests act as clients to my service, a testing posture that's not really supported by the default framework. Furthermore, I want to configure both the database & the web service in different ways and run the same suite of tests against each of those different configurations; again, the standard framework makes no provision for such "fixtures": a combination of state & ambient execution environment that needs to be set up & torn down for each integration test.
The situation turned out to be far from dire; it's just a less mature aspect of the Rust ecosystem. Solving the problem, however, involved wading through some less-than-thoroughly documented corners of that ecosystem, so I thought I'd write-up my solution as a guide for anyone else who finds themselves wandering this way. I'll begin with some of the basics of testing in Rust, moving quickly because I find it hard to imagine anyone unfamiliar with that reading this post. Then I'll sketch a general approach to getting what I want, with a detour into supporting async tests within that framework. Finally, I'll talk a bit about code organization.
Background
Cargo is not only the Rust package manager, it's also the project orchestrator. We create new projects with cargo new, build them with cargo build, and test them with cargo test. It is this last aspect of cargo with which we are concerned. Rust programmers generally write unit tests alongside the code under test:
pub fn add(left: usize, right: usize) -> usize {
left + right
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn it_works() {
let result = add(2, 2);
assert_eq!(result, 4);
}
}
This example, by the way, is from the "bible": the Rust Book, which covers unit testing Rust code in some detail. When we execute cargo test, the test harness automatically collects all our tests and executes them for us, like so:
$ cargo test
Compiling adder v0.1.0 (file:///projects/adder)
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.57s
Running unittests src/lib.rs (target/debug/deps/adder-92948b65e88960b4)
running 1 test
test tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests adder
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Now, if we create a tests directory in our project, cargo test will automatically treat files therein as integration tests: each file will be compiled to a separate executable which will depend on our crate (and can only see public functions & types). In each integration test, we can mark our test functions with #[test] (just like in our unit tests) and cargo test will automatically collect them for us.
Finally, the general format of the cargo test command is:
cargo test [options] [testname] [-- test-options]
That is, some options are passed to the overall test harness itself and others to the individual tests.
The Problem
For my project, to run an integration test, I need to spin-up the datastore, start the web service (the actual code under test), and then act as a client of said web service. Furthermore, I want to be able to execute the same test logic against different configurations of both the datastore & the service. Put another way, the tests need to be independent of the test fixture (if we consider the fixture to be a particular configuration of datastore plus service). The Rust Book mentions ways of getting common setup code into your integration tests, but not only does that leave the test logic dependent on the fixture, there's also no way to ensure that tear-down occurs should an assert!() fire in your test.
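To make that teardown concern concrete, here's a minimal std-only sketch (the `Fixture` type and its methods are hypothetical stand-ins; in the real project setup would start a datastore and the service): std::panic::catch_unwind lets us guarantee teardown even when an assert!() fires mid-test.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical fixture: in a real project this would start the
// datastore and the web service, and tear both down afterwards.
struct Fixture {
    name: &'static str,
}

impl Fixture {
    fn setup(name: &'static str) -> Self {
        println!("setting up fixture `{name}`");
        Fixture { name }
    }

    fn teardown(&self) {
        println!("tearing down fixture `{}`", self.name);
    }
}

// Run a test body against a fixture, guaranteeing teardown even if
// the body panics (e.g. when a failed assert! unwinds).
fn run_with_fixture(name: &'static str, body: impl FnOnce(&Fixture)) -> bool {
    let fixture = Fixture::setup(name);
    let result = catch_unwind(AssertUnwindSafe(|| body(&fixture)));
    fixture.teardown(); // runs regardless of success or failure
    result.is_ok()
}

fn main() {
    let passed = run_with_fixture("default-config", |_fx| {
        assert_eq!(2 + 2, 4);
    });
    let failed = run_with_fixture("default-config", |_fx| {
        assert_eq!(2 + 2, 5); // fires, but teardown still runs
    });
    println!("passed: {passed}, failing test passed: {failed}");
}
```

The default test harness gives you no such hook around the whole suite, which is precisely the gap the rest of this post fills.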
The Nextest project looks interesting, but 1) it is implemented as a Cargo plugin (i.e. to use it one needs to say cargo nextest rather than cargo test) and 2) it still provides nothing in terms of test fixtures.
The Solution
What we need to do is get our fingers into the logic between the start of main() in each integration test and the actual tests themselves. The only way to do this I could find, per Advanced Rust Testing, is a custom test harness. By registering our integration tests with Cargo, and opting-out of the default test harness:
[[test]]
name = "integration"
# 👇 enabled by default
harness = false
we're signing-up to write the customized logic ourselves. After we add that to our Cargo.toml, Cargo expects to find a file named integration.rs in the tests directory containing a main(). cargo test will compile and execute it, passing along any command-line parameters given to cargo test after the --. Our implementation is expected to exit with status zero if all tests passed, and one otherwise. That's it: that's the contract.
It's not a great extension point for the system. Yes, it gives us the freedom to do what we want, but it requires us to take on responsibility for things we don't want to customize; to wit: parsing the per-test command-line arguments. While a compliant implementation could simply ignore its command-line arguments, that would be quite surprising to one's users. That said, I count sixteen options accepted by libtest: that's a lot of work, especially when I just want to replicate what libtest already does(!) Enter libtest-mimic: "This is a simple and small test harness that mimics the original libtest… all output looks pretty much like cargo test and most CLI arguments are understood and used. With that plumbing work out of the way, your test runner can focus on the actual testing." Perfect.
Alright, so having registered our test as above, we can now write our own main() like so:
// integration.rs
fn main() -> Result<()> {
let args = libtest_mimic::Arguments::from_args();
// Test fixture setup to go here...
let conclusion = libtest_mimic::run(&args, todo!());
// Test fixture teardown to go here...
conclusion.exit();
}
I won't dig into the fixture setup & teardown in this post, since it's specific to my project and unlikely to be of general interest. But what about that second parameter to run()? How do we tell libtest-mimic about the test suite? The signature for this function is:
pub fn run(args: &Arguments, tests: Vec<Trial>) -> Conclusion
so we need to assemble a vector of Trial. Trial instances can only be constructed (for our purposes) via the constructor Trial::test():
pub fn test<R>(name: impl Into<String>, runner: R) -> Self
where R: FnOnce() -> Result<(), Failed> + Send + 'static
OK: so for each test, we need a textual name (or something that can be converted into one), and a thing that implements FnOnce() (well, plus the Send marker trait and a 'static bound). The question is: how do we assemble that list?
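To see what that bound admits, here's a std-only sketch with local stand-ins (this `Trial` and `Failed = String` are illustrations only, not libtest-mimic's real types): both plain functions and move-closures satisfy `FnOnce() -> Result<(), Failed> + Send + 'static`.

```rust
// Stand-ins for libtest-mimic's types, for illustration only.
type Failed = String;

struct Trial {
    name: String,
    runner: Box<dyn FnOnce() -> Result<(), Failed> + Send + 'static>,
}

impl Trial {
    // Mirrors the shape of libtest_mimic::Trial::test().
    fn test<R>(name: impl Into<String>, runner: R) -> Self
    where
        R: FnOnce() -> Result<(), Failed> + Send + 'static,
    {
        Trial { name: name.into(), runner: Box::new(runner) }
    }
}

fn main() {
    // A plain function satisfies the bound…
    fn my_test() -> Result<(), Failed> {
        Ok(())
    }
    // …and so does a move-closure.
    let trials = vec![
        Trial::test("my-test", my_test),
        Trial::test("my-failing-test", move || Err("boom".to_string())),
    ];
    for t in trials {
        let outcome = (t.runner)();
        println!("{} ... {}", t.name, if outcome.is_ok() { "ok" } else { "FAILED" });
    }
}
```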
The obvious move is to just maintain the list by hand in main(), somewhere above the call to run(). As T.J. Telan notes in his post on this topic, however, that becomes tedious (and error-prone) quickly. I suppose one could write a macro to serve the same purpose as the #[test] attribute macro (which I guess is a compiler built-in?). libtest-mimic has an example of an implementation that will walk the project source tree looking for files on which to perform a given "tidying" test. One could, I suppose, look for distinguishing text patterns in the integration tests directory.
Still, both T.J. & I wound-up using David Tolnay's inventory crate: "This crate provides a way to set up a plugin registry into which plugins can be registered from any source file linked into your application. There does not need to be a central list of all the plugins." One describes a class of "plugin" (tests, in our case) by defining a struct to represent all possible instances, then "registering" that type with the inventory crate:
pub struct Test {
pub name: &'static str,
pub test_fn: fn() -> Result<(), Failed>, // `Failed` is provided by libtest-mimic
}
inventory::collect!(Test);
Then, wherever is convenient, we can instantiate a test and register it:
// `my_cool_test` is a test function:
// fn my_cool_test() -> Result<(), Failed>
inventory::submit!(
    Test {
        name: "my-cool-test",
        test_fn: my_cool_test,
    }
);
Finally, when we need to iterate over the list of all registered tests, we can say:
inventory::iter::<Test>
.into_iter()
.map(|test| {
Trial::test(
test.name,
test.test_fn,
)
})
.collect()
The Configuration Problem
At this point, the main in my integration tests was looking like this:
// integration.rs
fn main() -> Result<()> {
let args = libtest_mimic::Arguments::from_args();
// Test fixture setup to go here...
let conclusion = libtest_mimic::run(
&args,
inventory::iter::<Test>
.into_iter()
.map(|test| {
Trial::test(
test.name,
test.test_fn,
)
})
.collect()
);
// Test fixture teardown to go here...
conclusion.exit();
}
That's great, so long as our tests take no parameters. In my case, I needed to pass a variety of test-specific parameters, in all cases derived from a configuration file. I changed my Test type to take the configuration…
pub struct Test {
pub name: &'static str,
pub test_fn: fn(Configuration) -> Result<(), Failed>, // `Failed` is provided by libtest-mimic
}
and with the use of a lambda at the point of test registration, picked-out just the pieces of the configuration required by each test:
inventory::submit!(Test {
name: "test_healthcheck",
test_fn: |cfg: Configuration| { test_healthcheck(cfg.url) },
});
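The reason that lambda works is worth a word: the test_fn field is a plain fn pointer, and a closure coerces to a fn pointer only when it captures nothing, so the adapter must pull everything it needs out of its Configuration argument. A std-only sketch (this Configuration, test_healthcheck, and the URL are hypothetical stand-ins for my project's types):

```rust
// Hypothetical stand-ins for the project's configuration and a test.
#[derive(Clone)]
struct Configuration {
    url: String,
}

type Failed = String;

struct Test {
    name: &'static str,
    // A plain fn pointer: only non-capturing closures coerce to this type.
    test_fn: fn(Configuration) -> Result<(), Failed>,
}

fn test_healthcheck(url: String) -> Result<(), Failed> {
    if url.starts_with("http") {
        Ok(())
    } else {
        Err(format!("bad url: {url}"))
    }
}

fn main() {
    // The adapter captures nothing, so it coerces to the fn-pointer
    // type while picking out just the field this test needs.
    let test = Test {
        name: "test_healthcheck",
        test_fn: |cfg: Configuration| test_healthcheck(cfg.url),
    };
    let cfg = Configuration { url: "http://localhost:8080".to_string() };
    println!("{} -> {:?}", test.name, (test.test_fn)(cfg));
}
```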
main() now looks like this:
// integration.rs
fn main() -> Result<()> {
let config = Configuration::new()?;
let args = libtest_mimic::Arguments::from_args();
// Test fixture setup to go here...
let conclusion = libtest_mimic::run(
&args,
inventory::iter::<Test>
.into_iter()
.map(|test| {
Trial::test(
test.name,
{
let cfg = config.clone();
move || { (test.test_fn)(cfg) }
}
)
})
.collect()
);
// Test fixture teardown to go here...
conclusion.exit();
}
The async Problem
So far, so good. Next wrinkle: my tests are all async while libtest-mimic expects synchronous tests. This was more challenging.
In this particular case, we can make this work by passing libtest-mimic a lambda that wraps our test invocation in a call to the block_on() method of our runtime. Simply decorating main() with #[tokio::main] won't work, though, because it leaves us no way to get our hands on that runtime. The solution is to construct the runtime ourselves, leaving us with a main() that looks like this:
// integration.rs
fn main() -> Result<()> {
let rt = Arc::new(Runtime::new().expect("Failed to build a tokio multi-threaded runtime"));
let config = Configuration::new()?;
let args = libtest_mimic::Arguments::from_args();
// Test fixture setup to go here...
let conclusion = libtest_mimic::run(
&args,
inventory::iter::<Test>
.into_iter()
.map(|test| {
Trial::test(
test.name,
{
let rt = rt.clone();
let cfg = config.clone();
move || rt.block_on( async { (test.test_fn)(cfg).await })
}
)
})
.collect()
);
// Test fixture teardown to go here...
conclusion.exit();
}
Also, our test definition will have to change to:
pub struct Test {
pub name: &'static str,
pub test_fn: fn(Configuration) -> futures::future::BoxFuture<'static, Result<(), Failed>>,
}
and our test registrations to:
inventory::submit!(Test {
name: "test_healthcheck",
test_fn: |cfg: Configuration| { Box::pin(test_healthcheck(cfg.url)) },
});
Code Arrangement
Finally: how is all this arranged in my project? I chose not to add these to the integration tests in my web service's crate, since tests there are intended to test the crate's library package from the perspective of a consumer of that crate (i.e. no access to private module members). I instead took the approach that tokio (and others) uses: stand-up a sibling "test crate" inside a workspace:
+ workspace root
|
+-- Cargo.lock
|
+-- my project crate
| |
| +-- Cargo.toml
| |
| +-- src: web service source code
| |
| +-- tests: integration tests for my web service's
| library package
|
+-- my test crate: integration tests for the web service crate
|
+-- Cargo.toml: lists the project crate as a dependency to
| ensure it's built before any tests are run; also all
| integration tests are registered here
|
+-- src: test logic that applies to multiple fixtures
| and any supporting code
|
+-- tests: integration tests, one test per fixture
Each file in the top-level tests directory in the test crate is an integration test that gets registered in the test crate's Cargo.toml and carries out tests against a given test fixture, setting-up & tearing-down that fixture on start-up & termination (regardless of success or failure).
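As a sketch of the wiring (all names here are hypothetical; adapt them to your own crates), the workspace root and the test crate's Cargo.toml might look roughly like this:

```toml
# workspace root Cargo.toml (member names are hypothetical)
[workspace]
members = ["my-service", "my-service-tests"]

# my-service-tests/Cargo.toml
[package]
name = "my-service-tests"
version = "0.1.0"
edition = "2021"

[dependencies]
# Depending on the project crate ensures it's built before the tests run.
my-service = { path = "../my-service" }

# One [[test]] entry per fixture, each opting out of the default harness.
[[test]]
name = "fixture_one"
harness = false

[[test]]
name = "fixture_two"
harness = false
```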
Conclusion
The result of all of this has been an integration test suite for my web service that supports fixtures, and that makes it easy to add new tests that apply to one, some, or all fixtures with minimal boilerplate.
I'd like to particularly thank T.J. Telan for his post How to Build a Custom Integration Test Harness in Rust, which proved invaluable for me in figuring this all out.