Structuring a Monorepo with Packages
Now we'll start to add Bazel configuration for the first-party code in our repository.
By the end of this section, you'll be able to run bazel query
to explore Bazel's dependency and action graphs for at least the language you picked.
If you weren't able to get things working in the previous section, you can advance your codelabs repo to the dependencies branch, e.g. with git reset --hard dependencies.
Remember the teaching assistants are here to answer questions one-on-one to keep you unstuck!
Concepts
A "Bazel package" is a filesystem tree rooted at a BUILD or BUILD.bazel file.
root/                     # This is the "//" package
├── BUILD.bazel
├── ...
├── animations            # Here is the "//animations" package
│   ├── BUILD.bazel
│   ├── browser           # Here is "//animations/browser"
│   │   ├── BUILD.bazel
│   │   └── ...
│   ├── ...
│   ├── src
│   │   └── index.ts
│   ├── test
│   │   ├── BUILD.bazel
│   │   └── ...
Importantly, observe how animations/src/index.ts is in the animations package: the src directory has no BUILD file of its own, so it belongs to the nearest enclosing package.
Packages are encapsulated:
- glob doesn't cross them (!!)
- Sources in a package aren't visible outside without exports_files
- Outputs must be written within the same package
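As a sketch of the exports_files point, a package can make individual source files addressable from other packages. The file name and the consuming target below are hypothetical and are not taken from this repository:

```starlark
# animations/BUILD.bazel (hypothetical content)
# Makes the source file addressable from other packages
# as //animations:tsconfig.json.
exports_files(["tsconfig.json"])

# animations/browser/BUILD.bazel (hypothetical consumer, a separate file)
filegroup(
    name = "config",
    srcs = ["//animations:tsconfig.json"],
)
```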
We prefer BUILD.bazel over BUILD to be more explicit, to allow for tooling to select with *.bazel, and to avoid colliding with a directory named build on case-insensitive systems.
Labels
A "Label" is a string identifier that refers to a source file, an output file, or a target.
Example:
          ┌ package name ┐
          v              v
@angular//animations/utils:draw_circle
^                          ^
|                          └ target
└ repository name (optional)
Label shorthand
- If the working directory is in the same workspace, you can drop the repository name: //animations/utils:draw_circle. Here // means the root of that workspace.
- On the command line, labels can be relative to the working directory
- Each package has a default label, named after the package
You can usually use this shorthand to save typing.
For example, you could just cd backend; bazel run devserver rather than bazel run //backend/devserver:devserver.
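The default-label shorthand also applies to labels written inside BUILD files. A minimal sketch, assuming hypothetical targets named draw_circle, utils and local_helpers exist; only the label spellings matter here:

```starlark
# animations/BUILD.bazel (hypothetical content)
filegroup(
    name = "drawing",
    srcs = [
        # Fully spelled out: package //animations/utils, target draw_circle.
        "//animations/utils:draw_circle",
        # Default label: short for //animations/utils:utils.
        "//animations/utils",
        # Same-package reference: short for //animations:local_helpers.
        ":local_helpers",
    ],
)
```

Unlike on the command line, a label in a BUILD file is never resolved against your working directory.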
Every package should have a nice default target, to save typing and make an ergonomic experience
for developers interacting with Bazel in your project.
You can use alias to introduce an indirection. For example, if you'd like users to be able to bazel run backend from the repository root, then you'd add an alias to backend/BUILD.bazel:
alias(
    name = "backend", # the default target for the backend package
    actual = "//backend/devserver",
)
Run bazel help target-syntax to see the full rules for writing labels and target patterns.
Starlark
Starlark is a Python-like language used by Bazel, Buck, Tilt, and many other tools. There are Java, Go, and Rust implementations of the interpreter.
The spec is surprisingly readable, and explains things like how the execution model is guaranteed to allow parallel evaluation. Read: https://github.com/bazelbuild/starlark/blob/master/spec.md
BUILD.bazel files are written in a subset of Starlark.
Bazel extensions, written in *.bzl files, use the full Starlark language.
Anatomy of a BUILD file
load statements
load statements should appear at the top of the file.
They import symbols into file scope.
The first argument is the label of a .bzl source file, and the following arguments are the symbol(s) to load.
load("@aspect_bazel_lib//lib:write_source_files.bzl", "write_source_files")
You can alias a symbol on load, to avoid collisions:
load("@npm//:typescript/package_json.bzl", typescript_bin = "bin")
package statement
Optionally, you can define defaults for all targets in the BUILD file:
package(default_visibility = ["//visibility:public"])
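An individual target can still override the package-wide default. A minimal sketch, with hypothetical target and file names:

```starlark
package(default_visibility = ["//visibility:public"])

# Inherits the public default declared above.
filegroup(
    name = "schemas",
    srcs = ["logger.schema.json"],
)

# Opts out: only targets in this same package may depend on it.
filegroup(
    name = "test_fixtures",
    srcs = ["fixtures.json"],
    visibility = ["//visibility:private"],
)
```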
Target declarations
A "rule" is like a constructor, creating a target. These are "bare facts": they only describe the source files and their dependencies. They do not say what to do. The rule implementation is responsible for instructing Bazel what build steps are required.
ts_project(
    name = "compile",
    srcs = ["index.ts"],
    tsconfig = "//src:tsconfig",
    data = ["my.json"],
    deps = [":node_modules/color"],
)
The arguments to the ts_project rule are called "attributes". Some are common to most rules:
- name is always required; we'll use this in a label to refer to the target
- srcs typically means files in the source tree which are grouped together
- deps typically means other targets, either first-party (1p) or third-party (3p), needed at build time
- data is like deps but is only needed at runtime
Other attributes are particular to the rule implementation:
- tsconfig is an attribute specific to ts_project which tells Bazel where the config file is
Writing BUILD files
Typically, srcs is "all the files in this package of the given kind".
In the example above we'd always want it to contain all the *.ts files.
The glob function allows you to skip file listing, for example with srcs = glob(["*.ts"]).
However, the glob must be evaluated every time Bazel loads the file, and so it incurs a performance penalty, especially as the number of files in the package grows.
It also doesn't descend into sub-packages, so it's easy to omit files by accident.
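To make the sub-package behaviour concrete, here is a sketch against the animations tree shown earlier. It uses a plain filegroup so that no rule set needs to be loaded; the target name and the *.spec.ts naming convention are hypothetical:

```starlark
# animations/BUILD.bazel (hypothetical content)
filegroup(
    name = "ts_sources",
    # Matches src/index.ts, because src/ has no BUILD file and therefore
    # belongs to this package. Matches nothing under browser/ or test/:
    # those directories have their own BUILD.bazel files, so glob stops
    # at the package boundary.
    srcs = glob(
        ["**/*.ts"],
        exclude = ["**/*.spec.ts"],  # hypothetical test-file naming
    ),
)
```

Files under browser/ have to be exposed by a target in //animations/browser and then referenced by label.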
Typically, deps can be determined by looking at all the import statements in the srcs and mapping them to labels which provide that symbol.
Of course this can be automated, so that BUILD files are largely machine-maintained.
You can learn more about Gazelle, which is a framework for BUILD file generators.
bazel configure
So far this works for Go and Protobuf. With the 'pro' version it also includes JavaScript/TypeScript. We are still adding more languages, so check back.
Try it: create BUILD files
The schema folder has our protobuf definition, and you'll want to depend on that from other folders. Since we use Aspect CLI, you can just run bazel configure to get a BUILD file generated for our logger.proto.
Next, for the language you've chosen, try creating the BUILD.bazel file.
Since our examples are simple and only have a single source file or folder of source files, you'll only need one for each language.
JavaScript
Use ts_project to typecheck and transpile to JavaScript, then create an http_server_binary for the http-server npm package and call it using js_run_devserver.
Java
The java_grpc_library rule isn't available under bzlmod yet, so we've just stashed the resulting generated code next to the sources.
Go
bazel configure or Gazelle should generate a complete, working BUILD file.
Python
A Gazelle extension is available from rules_python, but a three-line py_binary is enough to make the CLI work.
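For reference, a py_binary of roughly that size might look like the following. The target and file names are placeholders, so substitute the ones used in your checkout:

```starlark
load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "cli",       # hypothetical target name
    srcs = ["cli.py"],  # hypothetical source file; main defaults to <name>.py
)
```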
Machine-editing BUILD files
You can use buildozer to script around printing and modifying BUILD file content, which is an essential skill for doing repository-wide refactorings.
Buildozer is purely syntactic, operating on the Starlark Abstract Syntax Tree (AST). This can be convenient if you want to see what the user typed, before loading and macro expansion occur. It's also guaranteed to be fast, while loading might take a long time.
There's a dedicated 'print' command to make this feature easier to access:
bazel print //some:target
The Dependency Graph
In the Loading phase, Bazel loads all of the BUILD.bazel files needed for whatever target(s) or target patterns you request.
These targets form a Directed Acyclic Graph (DAG) called the "dependency graph".
Try it: bazel cquery
cquery means "configured query", which is typically what you want.
The unconfigured query command is just bare query.
Unfortunately the shorter name is taken by the less-useful command.
Here are some things you can try:
- What are all the binary targets in the repo?
- Draw a diagram like the one above for the language you're working with.
- Why does your binary depend on a particular third-party library?
Actions
When Bazel needs to transform inputs to outputs, it does it by spawning an Action, which is just a subprocess invoking some tool.
The Action Graph
In the Analysis phase, the dependency graph is "lowered" to an action graph. In the action graph, each node is a subprocess to spawn (invoking some tool) and the edges are files and Providers which are output by one action and needed as inputs to another.
The graphs are NOT one-to-one!
For example, a ts_project rule with a custom transpiler produces several actions.
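To see how one target can expand into several actions, consider this sketch of a custom rule. Everything in it (the file name two_step.bzl, the rule name, the attribute, and the mnemonics) is invented for illustration and isn't part of the codelab:

```starlark
# two_step.bzl (hypothetical file)
def _two_step_impl(ctx):
    intermediate = ctx.actions.declare_file(ctx.label.name + ".upper.txt")
    final = ctx.actions.declare_file(ctx.label.name + ".count.txt")

    # Action 1: upper-case the source file into an intermediate output.
    ctx.actions.run_shell(
        inputs = [ctx.file.src],
        outputs = [intermediate],
        command = "tr a-z A-Z < {} > {}".format(ctx.file.src.path, intermediate.path),
        mnemonic = "UpperCase",
    )

    # Action 2: consume the output of action 1 and count its bytes.
    ctx.actions.run_shell(
        inputs = [intermediate],
        outputs = [final],
        command = "wc -c < {} > {}".format(intermediate.path, final.path),
        mnemonic = "CountBytes",
    )

    # Only the final file is a default output of the target.
    return [DefaultInfo(files = depset([final]))]

two_step = rule(
    implementation = _two_step_impl,
    attrs = {
        "src": attr.label(allow_single_file = True, mandatory = True),
    },
)
```

Declaring a single two_step target adds one node to the dependency graph but two nodes to the action graph, with the intermediate file as the edge between them.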
Try it: bazel analyze
Bazel doesn't actually have a command called analyze.
It's spelled "build --nobuild" instead.
This command is rarely useful.
You might use it if you're making a big, breaking refactoring, so that you can resolve all the analysis failures first before attempting to build anything.
You could also use it to reason about what is the slow step in your CI pipeline.
bazel build --nobuild //...
Querying the action graph
This is a valuable skill when debugging a failure of some rule, especially when required inputs aren't declared.
You can also run arbitrary Starlark programs over configured targets with --output=starlark, an output format of bazel cquery, which is a powerful tool.
Try it: bazel aquery
- What are the declared input files to the compile action for a library target you've created?
- What providers are produced by the library target? (You'll need a tiny starlark program)
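For the providers question, one possible approach is a Starlark output formatter for bazel cquery. The sketch below assumes it is saved under the hypothetical file name providers.cquery and passed via --output=starlark and --starlark:file; check bazel help cquery for the exact flags in your Bazel version:

```starlark
# providers.cquery (hypothetical file name)
# Assumed invocation:
#   bazel cquery //some:target --output=starlark --starlark:file=providers.cquery

# cquery calls format() once per configured target and prints the result.
def format(target):
    # providers() is a helper available in cquery's Starlark output mode.
    # It is assumed here to return a dict keyed by provider name.
    names = sorted(providers(target).keys())
    return "{}: {}".format(target.label, ", ".join(names))
```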