Fetching External Dependencies
Before we can work with the code in this repository, we need the toolchains and third-party dependencies that it relies on.
By the end of this section, you should be able to run bazel fetch to download these for the language you pick (Java, JavaScript, Go, or Python).
If you're stuck, ask the instructor or teaching assistants for help!
Concepts
A "Bazel module" is a Bazel project that can have multiple versions, each of which publishes metadata about other modules that it depends on.
"Starlark" is a dialect of Python used to configure Bazel, as well as some other tools.
A "Starlark module" is a different concept, representing a .bzl file we can load from.
Introduced in Bazel 6.0, "bzlmod" is the package manager for Bazel modules. Read more in the documentation: https://bazel.build/build/bzlmod#modules
Exercise: bazel fetch
We need to create the minimal boilerplate files in our repository to "install" Bazel.
% touch WORKSPACE.bazel MODULE.bazel
% echo 7.0.0rc4 > .bazelversion
Let's find a dependency to add. Go to the Web UI for the Bazel Central Registry: https://registry.bazel.build
Search for "bazel-lib", which is a basic library with some simple building blocks.
Use the button to copy the text from the "Install" code block and paste it into MODULE.bazel.
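As a rough idea of what the pasted snippet looks like, here is a minimal MODULE.bazel sketch. The version string is only an illustrative assumption; use whatever the registry's "Install" block currently shows.

```starlark
# MODULE.bazel
# Declares a dependency on the bazel-lib module from the Bazel Central Registry.
# NOTE: the version below is an assumption for illustration; copy the real one
# from https://registry.bazel.build.
bazel_dep(name = "aspect_bazel_lib", version = "2.0.0")
```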
Now you can ask Bazel to fetch that package:
% bazel fetch @aspect_bazel_lib//lib:all
Bazel also creates a MODULE.bazel.lock file, which is intended to make module resolution more reproducible. This file should be checked in.
Bazel also has a sync command, but this is rarely useful and not covered here.
Language dependencies
Bzlmod delegates to other language-specific package managers.
- Python: pip
- JavaScript: pnpm
- Java: Coursier
- and so on
:::exception C++ doesn't really have a popular "package manager", so the Bazel Central Registry is accidentally becoming one. :::
In most cases, we'll still use the canonical files for declaring these dependencies under Bazel, though they should always be "pinned" for reproducible builds. We want to preserve interoperability with existing tools as much as possible, such as editors and static analysis tools, and these understand the idiomatic files for the language.
- "pinned" means direct and transitive dependency versions are always exactly specified
- can include integrity hashes for supply chain security
The steps to do this vary a bit between languages, but they all have the following rough outline:
- You may leave the developer's constraints alone, even when they use semver ranges or no version constraint at all. For example:
  - our frontend/package.json allows any version of http-server
  - our requirements.txt allows any version of requests
- Pin transitive dependencies to a constant version
  - These are generally written to a separate "lock" file.
- Mirror that dependency list into Starlark
  - This allows Bazel to manage the dependencies itself.
- Add code to expose external repositories for use by BUILD targets
  - The instructions for each language should tell you how to do this.
In practice, you'll find that not all rules do a good job of documenting bzlmod usage yet. You can get a hint by finding the tests for a ruleset.
On https://registry.bazel.build, click the "View registry source" link for a module, and open the presubmit.yml file.
You'll find a path to some subfolder where a test lives.
These are executable examples, so they give us a clue how the module is used.
For example,
bcr_test_module:
  module_path: 'e2e/bzlmod'
Then you'd navigate to the /e2e/bzlmod folder in the ruleset repo, and there will be something that is guaranteed to work.
Exercise: pin the transitive dependencies and mirror into Starlark
Bazel's reproducibility can only be as good as the information it's given. Each external package manager has a feature to pin the dependencies.
Your goal is to produce the following files, for the languages you care about:
- go.mod -> go.sum
- package.json -> pnpm-lock.yaml
- requirements.txt -> requirements_lock.txt
- Java sources -> maven_install.json
- Package.swift -> Package.resolved
You'll have to read the documentation for the ruleset you use to figure out how to do this. You'll also need to add to the MODULE.bazel file by searching the Registry and following each ruleset's install instructions.
This is the most challenging part of Bazel setup, and only happens once per repo. Don't worry if you can't get this to work. We'll go through one example together.
Go
See https://github.com/bazelbuild/bazel-gazelle#update-repos.
However, instead of adding go_repository rules to WORKSPACE, we need to add go_deps.module calls to MODULE.bazel.
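A minimal sketch of what this might look like in MODULE.bazel, assuming a single dependency on github.com/google/go-cmp (chosen only for illustration). The module versions and the sum placeholder are assumptions; take real values from the registry and from your go.sum.

```starlark
# MODULE.bazel
# Versions are illustrative assumptions; check the registry for current ones.
bazel_dep(name = "rules_go", version = "0.44.0")
bazel_dep(name = "gazelle", version = "0.35.0")

go_deps = use_extension("@gazelle//:extensions.bzl", "go_deps")

# One go_deps.module call per Go module, taking the place of a go_repository rule.
go_deps.module(
    path = "github.com/google/go-cmp",
    sum = "h1:<copy the hash from go.sum>",  # placeholder, not a real hash
    version = "v0.6.0",
)

# Expose the generated repository so BUILD targets can depend on it.
use_repo(go_deps, "com_github_google_go_cmp")
```

Gazelle's go_deps extension also offers go_deps.from_file(go_mod = "//:go.mod"), which reads the module list from go.mod instead of listing each module by hand.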
Java
See @maven//:pin
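For orientation, here is a minimal MODULE.bazel sketch using the maven extension from rules_jvm_external. The ruleset version, the Guava artifact, and the lock file location are all assumptions made for illustration.

```starlark
# MODULE.bazel
# Version is an illustrative assumption; check the registry for the current one.
bazel_dep(name = "rules_jvm_external", version = "5.3")

maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

maven.install(
    # Direct dependencies only; transitive ones are resolved and pinned.
    artifacts = [
        "com.google.guava:guava:32.1.3-jre",  # example artifact, an assumption
    ],
    # The pinned, transitive dependency list produced by the pin target.
    lock_file = "//:maven_install.json",
)

# Expose the @maven repository so BUILD targets can depend on it.
use_repo(maven, "maven")
```

With something like this in place, bazel run @maven//:pin is the command that writes maven_install.json. Check the ruleset's documentation for the exact first-time workflow in the version you use.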
Python
See pip_parse
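Under bzlmod, the counterpart of pip_parse is the pip module extension in rules_python. A minimal sketch, assuming the lock file lives at the repository root and Python 3.11 is used (both assumptions, as are the ruleset version and the hub name):

```starlark
# MODULE.bazel
# Version is an illustrative assumption; check the registry for the current one.
bazel_dep(name = "rules_python", version = "0.27.1")

# Register a hermetic Python toolchain for the version used below.
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(python_version = "3.11")

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")

pip.parse(
    hub_name = "pip",  # name of the repository BUILD files will refer to
    python_version = "3.11",
    requirements_lock = "//:requirements_lock.txt",  # the pinned lock file
)

# Expose the @pip repository so BUILD targets can depend on it.
use_repo(pip, "pip")
```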
JavaScript
rules_js can read the pnpm-lock.yaml file directly and do the mirroring at runtime. However, if you want, you can have Bazel update the pnpm-lock.yaml file for you, in which case you'll check in the resulting dependency file.
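A minimal MODULE.bazel sketch of the first approach, where rules_js translates the lock file directly. The ruleset version and the location of pnpm-lock.yaml are assumptions; adjust the label if your lock file lives elsewhere, for example under frontend/.

```starlark
# MODULE.bazel
# Version is an illustrative assumption; check the registry for the current one.
bazel_dep(name = "aspect_rules_js", version = "1.34.0")

npm = use_extension("@aspect_rules_js//npm:extensions.bzl", "npm", dev_dependency = True)

# Reads pnpm-lock.yaml and generates an external repository from it.
npm.npm_translate_lock(
    name = "npm",
    pnpm_lock = "//:pnpm-lock.yaml",  # assumed location of the lock file
)

# Expose the @npm repository so BUILD targets can depend on it.
use_repo(npm, "npm")
```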
Swift
See rules_swift_package_manager's quickstart:
- Configure your MODULE.bazel.
- The codelabs repository already includes a Package.swift. Otherwise, create a minimal Package.swift file.
- Add Gazelle targets to the BUILD.bazel at the root of your workspace.
- Resolve the Swift package dependencies for your project.
Configuring the downloader
Bazel's downloader is full-featured, and you can use it to block undesired network access, fetch via your corporate proxy or artifact repository, and more.
Eager fetches
Developers shouldn't need to fetch things they don't use. For example, a developer in one language shouldn't be blocked waiting to download toolchains for some other language.
load is eager
The load statement in Starlark happens eagerly during the Loading phase, and causes things to be eagerly fetched.
This de-optimization is easily introduced, and typically is only diagnosed when developers complain about "slow initial builds".
In WORKSPACE/MODULE.bazel
Loads in these files are evaluated for every single build, regardless of the dependency graph or which targets the user requests. Bazel must evaluate the complete WORKSPACE and MODULE.bazel files to understand what third-party dependencies exist for the build. Let's say the WORKSPACE file contains this content:
load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    name = "my_deps",
    requirements_lock = "//path/to:requirements_lock.txt",
)

# Loading from @my_deps forces that repository to be fetched.
load("@my_deps//:requirements.bzl", "install_deps")

install_deps()
Because the second load statement loads from @my_deps, the my_deps repository is requested at loading time, and so the pip_parse implementation will run.
If it uses a hermetic Python interpreter, then that interpreter must be built or fetched for any build.
In BUILD.bazel
In this example, a BUILD file loads from @npm:
load("@npm//@bazel/typescript:index.bzl", "ts_project")

package(default_visibility = ["//visibility:public"])

ts_project(
    name = "a",
    srcs = glob(["*.ts"]),
    declaration = True,
    tsconfig = "//:tsconfig.json",
    deps = [
        "@npm//@types/node",
        "@npm//tslib",
    ],
)

filegroup(name = "b")
Even if a developer only asks Bazel to build the filegroup b, the load statement means that the @npm repository must be fetched.
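One possible way to avoid this, sketched under the assumption that b needs nothing from @npm, is to move it into its own package whose BUILD file has no load from @npm. The package path other/ is hypothetical.

```starlark
# other/BUILD.bazel (hypothetical package)
# There is no load() from @npm in this file, so building //other:b does not
# require loading the BUILD file above, and @npm does not need to be fetched.
package(default_visibility = ["//visibility:public"])

filegroup(name = "b")
```

A developer who only runs bazel build //other:b then loads just this package. The ts_project target a stays where it was and still triggers the fetch when it is requested.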
Exercise: Fetch
Let's verify we have one language working, by using bazel fetch to get one of the dependencies.