Now that we're able to build the code, we should figure out whether it works. By the end of this section, you'll have used `bazel test` to run some automated tests for the language you picked.
Bazel models tests as just a special case of running programs, where the exit code matters. So you can think of `bazel test my_test` as syntactic sugar for:

```
bazel build my_test && ./bazel-bin/my_test
```
The "program" being run is usually a test runner from your language ecosystem, such as JUnit, pytest, or mocha. However, it can also be a simple shell script, or any other program you write. Under Bazel, it's often useful to write your own test runner rather than build your test on an existing test/assertion framework, since Bazel already handles the mechanics of including your test in the build process.
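As a sketch, a hand-rolled test runner can be as simple as a shell script that prints a result and exits non-zero on failure (the command under test here is a stand-in; a real runner would invoke your binary):

```shell
#!/usr/bin/env bash
# Minimal hand-rolled test runner: check a program's output, exit 0 on pass, 1 on fail.
set -euo pipefail

# In a real test this would invoke the binary under test.
actual="$(echo 'hello' | tr '[:lower:]' '[:upper:]')"
expected="HELLO"

if [ "$actual" = "$expected" ]; then
  echo "PASS"
else
  echo "FAIL: expected '$expected', got '$actual'" >&2
  exit 1
fi
```

Wiring it up is just a matter of declaring something like `sh_test(name = "my_test", srcs = ["my_test.sh"])` in a BUILD file; `bazel test` then builds and runs it like any other test.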
Control over the test process is largely left to the user's Bazel command-line flags. These include:

- `--test_arg` - arguments to forward to the test runner's CLI
- `--test_env` - which environment variables should be included in the test's inputs (and cache key)
- `--test_output=streamed` - very useful to say "stream the log", so it's similar to watching the test runner CLI run
- `--test_output=errors` - typically used to get Bazel to print test failures to stdout; otherwise you can only get them from the test's log file under `bazel-testlogs`
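Putting a few of these together, a typical invocation might look like this (the target label, `API_URL`, and `--verbose` are placeholders):

```
bazel test //my:test --test_output=errors --test_env=API_URL --test_arg=--verbose
```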
The Test Encyclopedia is described as "an exhaustive specification of the test execution environment", and they really mean that. This contract between Bazel and your test runner process will often resolve a dispute over why someone's test isn't working the way they expect under Bazel. Please take a minute to scan that document right now. Really, just knowing what questions it answers can save you a bunch of time later. We'll wait.

```
$ bazel docs test-encyclopedia
```
Tests have a size. Size is an architectural hint: it implies a default timeout, and for larger sizes the scheduler also reserves more RAM.

- small = unit test
- medium = functional test
- large = integration test
- enormous = e2e test

The timeout is set based on the size. Size is also useful for filtering; for example, `--test_size_filters=small` asks Bazel to "just run the unit tests".
You can also filter with these flags:

- `--test_tag_filters`, e.g. `=smoke` to run tests with a custom tag "smoke"
- `--test_timeout_filters`, e.g. `=-eternal` to skip running tests that take over 15min
- `--test_lang_filters`, e.g. `=js,go` to run just the JavaScript and Go tests
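Combining a couple of these, a CI job that runs only small smoke tests might look like this (the target pattern is illustrative):

```
bazel test //... --test_size_filters=small --test_tag_filters=smoke
```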
The `timeout` attribute has an undesirable default value. It should have been the shortest one, so that developers are reminded to increase it when a test times out. To help get "correct" timeout values, we recommend always setting `--test_verbose_timeout_warnings` in your `.bazelrc` so that "timeout too long" messages are provided to developers.
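In `.bazelrc`, that looks like:

```
test --test_verbose_timeout_warnings
```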
You can add these to the `tags` attribute of a test to change the way Bazel runs it:

- `external` - the test is intentionally non-hermetic, as it tests something aside from its declared inputs. Forces the test to be unconditionally executed, regardless of the value of `--cache_test_results`, so the result is not cached.
- `exclusive` - the test is not isolated, and can interact with other tests running at the same time. Exclusive tests are executed serially after all build activity and non-exclusive tests have completed. They can't be run remotely either.
- `manual` - essentially "skip" or "disabled": a target wildcard pattern like `//...` won't include this target. You can still run it by listing the target explicitly.
- `requires-network` - declares that the test should run in a sandbox that allows network access.
- `flaky` - run it up to three times when it fails. See the section below.
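For example, in a `BUILD.bazel` file (the target and file names here are made up):

```
sh_test(
    name = "smoke_test",
    srcs = ["smoke_test.sh"],
    size = "small",
    tags = [
        "smoke",   # custom tag, usable with --test_tag_filters=smoke
        "manual",  # excluded from wildcard patterns like //...
    ],
)
```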
For more details, run:

```
bazel help tags
bazel help test
```
You can run `bazel coverage` to collect coverage data for supported languages/test runners. Bazel will combine coverage reports from multiple test runners. This area of Bazel doesn't work very well for many languages.

The `coverage` verb sets flags which cause the analysis cache to be discarded, which can cost minutes in CI. Consider using the equivalent:

```
bazel test --collect_code_coverage --instrumentation_filter=^//
```
Bazel assumes that test runners can produce test-case-level reporting output in the "JUnit XML" format. These are collected as `test.xml` files under the `bazel-testlogs` output tree.
## Other test outputs
You might want to grab screenshots or other output files that a test generates. Bazel doesn't allow tests to produce outputs the way build steps can, since tests are not build "Actions" but rather just some program being run. Instead, you can read the environment variable `TEST_UNDECLARED_OUTPUTS_DIR` from your test, and write files into that folder. After Bazel finishes, you can collect the results as zip files from the `bazel-testlogs` output tree.
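As a sketch, a test script might save an extra artifact like this (the file name is made up; the fallback to the current directory is only so the script also runs outside Bazel):

```shell
#!/usr/bin/env bash
# Write an extra artifact from a test.
# Bazel sets TEST_UNDECLARED_OUTPUTS_DIR; fall back to "." when run directly.
set -euo pipefail
out_dir="${TEST_UNDECLARED_OUTPUTS_DIR:-.}"
echo "pretend this is a screenshot" > "$out_dir/screenshot.txt"
echo "wrote screenshot.txt to $out_dir"
```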
Tests often want to connect to services or data sources as part of the "system under test". Many builds are set up to do this outside the build tool, like so:

- Start up some services, or populate a database
- Run the entry point for the testing tool
- Clean up
Bazel does not support this model. Bazel tests are just programs that exit 0 or not, and Bazel has no "lifecycle" hooks to run some setup or teardown for specific test targets.
Of course, you could just script around Bazel, the same as in the scenario above, by starting some services before running `bazel test` and then shutting them down at the end. However, this doesn't work well with remote execution. It also assumes that concurrent tests will be isolated from each other when accessing the shared resource, and it means you'll start up the services even if Bazel doesn't execute any tests because they are cache hits.
Ideally, tests are hermetic: they depend only on declared inputs, which are files. If a test needs to connect to a service, you can invert the model above: the test itself sets up the environment it needs and tears it down when it's done. Testcontainers is a great library for using Docker containers as part of the system under test.
Ideally, engineers would write deterministic tests. Not only is that unlikely to happen, it's sometimes not the best use of their time. What we all really want is for a passing test to mean everything is good, and for a failing test not to waste our time, assuming the infrastructure can get it to pass with some retries.

Bazel returns a special `FLAKY` status when a test has a mix of failing and passing runs.
There are two reasonable approaches for CI:

1. Pass `--flaky_test_attempts=[number]`, commonly with a value like 2 or 3. This will run any test 1-2 additional times if it fails. This is nice since:
   - you don't have to tell Bazel which tests are flaky ahead of time
   - only CI will do the retries, while developers will locally see a failure, which might motivate them to fix the problem

   The downside: it increases the time to report an actual failing test to 2-3x the test's runtime.
2. Allow a single failure of a test to fail the build, then tag the test target with `flaky = True`. Bazel will run a flaky test up to two additional times after the first failure. The downside is that the version control system becomes the database of which tests are flaky, and that database needs to be maintained manually. We recommend giving the buildcop a one-click way to mark a test as flaky (or remove it) by making a bot commit to the repo which uses buildozer to make the BUILD file edit. We are building a GitHub bot which does exactly this.
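A sketch of the buildozer edit such a bot would make (the target label is hypothetical):

```
buildozer 'set flaky True' //my/pkg:some_test
```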
## Determining if flakiness is fixed
When fixing a flaky test, it can be hard to know that the fix is right, since the test sometimes passes even with the bug. If the test's non-determinism can be reproduced locally by running it a few times, then use the flag `--runs_per_test=[number]` to "roll the dice" that many times.
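For example (the target label is illustrative):

```
bazel test //my:test --runs_per_test=10
```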
Try adding targets to the `BUILD.bazel` files to run some unit tests for the language you're working in. If there isn't already a test, try adding one, using a test runner you're familiar with.