Structuring a Monorepo with Packages

Now we'll start to add Bazel configuration for the first-party code in our repository.

By the end of this section, you'll be able to run bazel query to explore Bazel's dependency and action graphs for at least the language you picked.

If you weren't able to get things working in the previous section, you can advance your codelabs repo to the dependencies branch, e.g. with git reset --hard dependencies. Remember, the teaching assistants are here to answer questions one-on-one and keep you unstuck!

Concepts

Definition

A "Bazel package" is a directory containing a BUILD or BUILD.bazel file, together with all of its subdirectories except those that contain a BUILD file of their own.

root/                 # This is the "//" package
├── BUILD.bazel
├── ...
├── animations        # Here is the "//animations" package
│   ├── BUILD.bazel
│   ├── browser       # Here is "//animations/browser"
│   │   ├── BUILD.bazel
│   │   └── ...
│   ├── ...
│   ├── src
│   │   └── index.ts
│   ├── test
│   │   ├── BUILD.bazel
│   │   └── ...

Importantly, observe that animations/src/index.ts is in the //animations package, because the src directory has no BUILD file of its own.
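The ownership rule can be sketched in a few lines of Python: walk up from the file's directory until a BUILD file is found. This is only an illustration of the rule, not how Bazel implements it; the function name package_of and its parameters are made up for this example.

```python
from pathlib import Path
from typing import Optional

BUILD_FILE_NAMES = ("BUILD.bazel", "BUILD")


def package_of(workspace_root: Path, source_file: Path) -> Optional[str]:
    """Return the package that owns source_file, e.g. "//animations", or None.

    Hypothetical helper: it mimics the "nearest enclosing BUILD file" rule.
    Raises ValueError if source_file is not inside workspace_root.
    """
    root = workspace_root.resolve()
    relative_dir = source_file.resolve().parent.relative_to(root)
    # Check the file's own directory first, then each parent up to the root.
    for candidate in (relative_dir, *relative_dir.parents):
        directory = root / candidate
        if any((directory / name).is_file() for name in BUILD_FILE_NAMES):
            text = candidate.as_posix()
            return "//" + ("" if text == "." else text)
    return None  # no BUILD file anywhere between the file and the root
```

With the tree above, package_of(root, root / "animations/src/index.ts") yields "//animations", while a file under animations/browser belongs to "//animations/browser".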

Packages are encapsulated.

  • glob doesn't cross package boundaries (!!)
  • Sources in a package aren't visible to other packages unless they are exported with exports_files
  • Outputs must be written within the same package

Naming

We prefer BUILD.bazel over BUILD to be more explicit, to allow for tooling to select with *.bazel, and to avoid colliding with a directory named build on case-insensitive systems.

Labels

Definition

A "Label" is a string identifier that refers to a source file, an output file, or a target.

Example:

          ┌ package name ┐
          v              v
@angular//animations/utils:draw_circle
^                          ^
|                          └ target
└ repository name (optional)

Label shorthand

If the working directory is in the same workspace, the repository name can be omitted: //animations/utils:draw_circle

  • // means the root of that workspace.
  • On the command line, labels can be relative to the working directory.
  • Each package has a default label, named after the last component of the package path, so //animations/utils means //animations/utils:utils.

You can usually use this shorthand to save typing. For example, you could just run cd backend; bazel run devserver rather than bazel run //backend/devserver:devserver.
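To make the shorthand rules concrete, here is a minimal Python sketch that expands a few label forms into their full repository, package, and target parts. It is an illustration only: Label and expand_label are hypothetical names, and the sketch deliberately ignores the working-directory-relative patterns that the command line accepts, because resolving those depends on which directories are packages.

```python
from typing import NamedTuple


class Label(NamedTuple):
    repo: str     # "" means the current repository
    package: str  # "" means the root package
    target: str

    def __str__(self) -> str:
        prefix = f"@{self.repo}" if self.repo else ""
        return f"{prefix}//{self.package}:{self.target}"


def expand_label(text: str, current_package: str = "") -> Label:
    """Expand "@repo//pkg:name", "//pkg:name", "//pkg" or ":name"."""
    if text.startswith(":"):
        # ":name" is a target in the current package.
        if len(text) == 1:
            raise ValueError(f"label has no target name: {text!r}")
        return Label("", current_package, text[1:])

    repo = ""
    if text.startswith("@"):
        repo, separator, rest = text[1:].partition("//")
        if not separator:
            raise ValueError(f"unsupported label form: {text!r}")
    elif text.startswith("//"):
        rest = text[2:]
    else:
        raise ValueError(f"unsupported label form: {text!r}")

    package, separator, target = rest.partition(":")
    if not separator:
        # Omitted target: default to the last component of the package path.
        target = package.rsplit("/", 1)[-1]
    if not target:
        raise ValueError(f"label has no target name: {text!r}")
    return Label(repo, package, target)
```

For instance, str(expand_label("//backend/devserver")) gives "//backend/devserver:devserver", which is the long form from the example above.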

Every package should have a sensible default target, which saves typing and makes Bazel more ergonomic for developers in your project. You can use alias to introduce an indirection. For example, if you'd like users to be able to bazel run backend from the repository root, add an alias:

backend/BUILD.bazel
alias(
    name = "backend",  # the default target for the backend package
    actual = "//backend/devserver",
)

note

Run bazel help target-syntax to see the full list of supported target patterns.

Starlark

Starlark is a Python-like language used by Bazel, Buck, Tilt, and many other tools. There are Java, Go, and Rust implementations of the interpreter.

The spec is surprisingly readable, and explains things like how the execution model is guaranteed to allow parallel evaluation. Read: https://github.com/bazelbuild/starlark/blob/master/spec.md

BUILD.bazel files are written in a subset of Starlark. Bazel extensions, written in *.bzl files, use the full Starlark language.

Anatomy of a BUILD file

load statements

load statements should appear at the top of the file. They import symbols into file scope.

The first argument is the label of a .bzl source file, and the following arguments are the symbols to load from it.

load("@aspect_bazel_lib//lib:write_source_files.bzl", "write_source_files")

You can alias a symbol on load, to avoid collisions:

load("@npm//:typescript/package_json.bzl", typescript_bin = "bin")

package statement

Optionally, you can define defaults for all targets in the BUILD file:

package(default_visibility = ["//visibility:public"])

Target declarations

A "rule" is like a constructor: calling it creates a target. Target declarations are "bare facts": they only describe the source files and their dependencies, and do not say what to do. The rule implementation is responsible for instructing Bazel what build steps are required.

ts_project(
    name = "compile",
    srcs = ["index.ts"],
    tsconfig = "//src:tsconfig",
    data = ["my.json"],
    deps = [":node_modules/color"],
)

The arguments to the ts_project rule are called "attributes". Some are common for most rules:

  • name is always required; we'll use this in a label to refer to the target
  • srcs typically means files in the source tree which are grouped together
  • deps typically means other targets, either first-party or third-party, needed at build time
  • data is like deps but is only needed at runtime

Other attributes are particular to the rule implementation.

  • tsconfig is an attribute specific to ts_project which tells Bazel where the config file is

Writing BUILD files

Typically, srcs is "all the files in this package of the given kind". In the example above we'd always want it to contain all the *.ts files.

caution

The glob function allows you to skip file listing, for example with srcs = glob(["*.ts"]).

However, the glob must be evaluated every time Bazel loads the file, and so it incurs a performance penalty, especially as the number of files in the package grows.

It also doesn't descend into sub-packages, so it's easy to omit files by accident.
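The sub-package behaviour is easy to get wrong, so here is a rough Python sketch of it. It approximates a recursive pattern such as glob(["**/*.ts"]) by matching file names at any depth, and it skips any directory that contains its own BUILD file. The helper names are hypothetical, and this is not Bazel's glob implementation or its full pattern syntax.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List

BUILD_FILE_NAMES = ("BUILD.bazel", "BUILD")


def is_package(directory: Path) -> bool:
    """A directory starts a new package if it contains a BUILD file."""
    return any((directory / name).is_file() for name in BUILD_FILE_NAMES)


def package_glob(package_dir: Path, name_pattern: str) -> List[str]:
    """List files under package_dir whose name matches name_pattern.

    Subdirectories that are packages themselves are never entered,
    so their files are silently left out of the result.
    """
    matches = []
    pending = [package_dir]
    while pending:
        directory = pending.pop()
        for entry in directory.iterdir():
            if entry.is_dir():
                if not is_package(entry):
                    pending.append(entry)
            elif fnmatch(entry.name, name_pattern):
                matches.append(entry.relative_to(package_dir).as_posix())
    return sorted(matches)
```

If a test directory inside the package gains a BUILD.bazel file, its *.ts files drop out of the result without any error, which is exactly the accidental omission described above.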

Typically, deps can be determined by looking at all the import statements in the srcs and mapping them to labels which provide that symbol.

Of course this can be automated, so that BUILD files are largely machine-maintained.

To go further, you can learn about Gazelle, which is a framework for writing BUILD file generators.

Aspect CLI only
bazel configure

So far this works for Go and Protobuf. With the 'pro' version it also includes JavaScript/TypeScript. We are still adding more languages, so check back.

Try it: create BUILD files

The schema folder has our protobuf definition, and you'll want to depend on that from other folders. Since we use Aspect CLI, you can just run bazel configure to get a BUILD file generated for our logger.proto.

Next, for the language you've chosen, try creating the BUILD.bazel file. Since our examples are simple and only have a single source file or folder of source files, you'll only need one for each language.

JavaScript

Use ts_project to typecheck and transpile to JavaScript, then create an http_server_binary for the http-server npm package and call it using js_run_devserver.

Java

The java_grpc_library isn't available under bzlmod yet, so we've just stashed the resulting generated code next to the sources.

Go

bazel configure or Gazelle should work 100%.

Python

A Gazelle extension is available from rules_python, but a three-line py_binary is enough to make the CLI work.

Machine-editing BUILD files

You can use buildozer to script the printing and modification of BUILD file content, which is an essential skill for repository-wide refactorings.

Buildozer is purely syntactic, operating on the Starlark Abstract Syntax Tree (AST). This can be convenient if you want to see what the user typed, before loading and macro expansion occur. It's also guaranteed to be fast, whereas loading might take a long time.

Aspect CLI Only

There's a dedicated 'print' command to make this feature easier to access:

bazel print //some:target

The Dependency Graph

In the Loading phase, Bazel loads all of the BUILD.bazel files needed for whatever target(s) or target patterns you request.

These targets form a Directed Acyclic Graph (DAG) called the "dependency graph".
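As a mental model, the dependency graph can be pictured as a mapping from each target to its direct dependencies. The Python sketch below uses made-up target labels and two hypothetical helpers that mirror what the query functions deps(...) and somepath(...) compute; it is not how Bazel stores or traverses the graph.

```python
from typing import Dict, List, Set

# Hypothetical targets: each key depends directly on the labels in its list.
DEPS: Dict[str, List[str]] = {
    "//app:server": ["//app:lib", "//schema:logger_proto"],
    "//app:lib": ["//schema:logger_proto", "@maven//:guava"],
    "//schema:logger_proto": [],
    "@maven//:guava": [],
}


def transitive_deps(graph: Dict[str, List[str]], target: str) -> Set[str]:
    """Everything reachable from target, including target itself."""
    seen: Set[str] = set()
    stack = [target]
    while stack:
        label = stack.pop()
        if label not in seen:
            seen.add(label)
            stack.extend(graph.get(label, []))
    return seen


def some_path(graph: Dict[str, List[str]], start: str, end: str) -> List[str]:
    """Return one dependency path from start to end, or [] if there is none."""
    visited: Set[str] = set()
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            return path
        if node in visited:
            continue
        visited.add(node)
        for dep in graph.get(node, []):
            stack.append(path + [dep])
    return []
```

Here some_path(DEPS, "//app:server", "@maven//:guava") returns the chain through //app:lib, which is the kind of answer you want when asking why a binary depends on a third-party library.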

[Figure: result of querying the Java dependency graph]

Try it: bazel cquery

danger

cquery means "configured query", which is typically what you want.

The unconfigured query command is just bare query. Unfortunately the shorter name is taken by the less-useful command.

Here are some things you can try:

  • What are all the binary targets in the repo?
  • Draw a diagram like the one above for the language you're working with.
  • Why does your binary depend on a particular third-party library?

Actions

When Bazel needs to transform inputs to outputs, it does it by spawning an Action, which is just a subprocess invoking some tool.

The Action Graph

In the Analysis phase, the dependency graph is "lowered" to an action graph. In the action graph, each node is a subprocess to spawn (invoking some tool) and the edges are files and Providers which are output by one action and needed as inputs to another.

The graphs are NOT one-to-one! For example, a ts_project rule with a custom transpiler produces several actions.
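The idea that files are the edges between actions can be sketched in Python. The actions below are invented for illustration (the mnemonics and file names are assumptions, not what any real rule emits): one target expands into several actions, and an action must run after whichever action produces its inputs.

```python
from graphlib import TopologicalSorter
from typing import List, NamedTuple


class Action(NamedTuple):
    mnemonic: str       # assumed unique here so it can serve as the node key
    inputs: List[str]
    outputs: List[str]


# Hypothetical actions; the first two could both come from a single target.
ACTIONS = [
    Action("Transpile", ["index.ts"], ["index.js"]),
    Action("TypeCheck", ["index.ts", "tsconfig.json"], ["index.d.ts"]),
    Action("Bundle", ["index.js"], ["bundle.js"]),
]


def execution_order(actions: List[Action]) -> List[str]:
    """Order actions so that each runs after the producers of its inputs."""
    producer = {out: action.mnemonic for action in actions for out in action.outputs}
    sorter = TopologicalSorter()
    for action in actions:
        # Inputs that no action produces are source files and add no edge.
        upstream = {producer[name] for name in action.inputs if name in producer}
        sorter.add(action.mnemonic, *upstream)
    return list(sorter.static_order())
```

In this sketch Transpile and TypeCheck share no edge, so they could run in parallel, while Bundle has to wait for index.js.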

Try it: bazel analyze

Bazel doesn't actually have a command called analyze. It's spelled "build --nobuild" instead. This command is rarely useful. You might use it during a big, breaking refactoring, to resolve all the analysis failures before attempting to build anything. You could also use it to work out which step of your CI pipeline is slow.

bazel build --nobuild //...

Querying the action graph

This is a valuable skill when debugging a failure of some rule, especially when required inputs aren't declared.

You can run arbitrary Starlark programs against the results with --output=starlark, which is a powerful tool. Note that this output format belongs to bazel cquery.

Try it: bazel aquery

  • What are the declared input files to the compile action for a library target you've created?
  • What providers are produced by the library target? (You'll need a tiny Starlark program)