Accelerating Bazel CI/CD Pipeline Performance for Autonomous Vehicles
What is Bazel?
Bazel is an open-source build and test tool similar to Make, Maven, and Gradle. It uses a human-readable, high-level build language. Bazel supports projects in multiple languages and builds outputs for multiple platforms. Bazel supports large codebases across multiple repositories, and large numbers of users.
Why should I use Bazel?
Bazel offers the following advantages:
- Declarative language
- Reproducibility
- Scalability
- Parallel and distributed execution
- Building polyglot projects
- Extensibility
- Long-term support
- Weka’s parallel file system integration with Bazel has delivered roughly a 7x performance improvement for one of our leading autonomous vehicle customers. Bazel builds produce many small files, which need little capacity but generate a large number of metadata operations. Weka acts as a caching layer: the customer caches ‘artifacts’ (pieces of builds) on WekaFS and uses them for distributed builds. Weka does not hold the primary data, only the cached pieces. The I/O pattern is 98% reads and 2% writes or, as the customer describes it at the application level, 98% cache hits and 2% writes.
- Bazel’s use of runfiles trees creates very heavy demand for metadata operations. Until the storage system delivers extremely high metadata performance, other performance parameters barely matter. Weka does a great job here. Weka also has a new feature in v3.8.1 called Adaptive caching, which emulates “local disk” performance and caching. This is expected to further enhance metadata performance.
- For remote compilation actions in particular, we were able to further optimize our system thanks to Weka’s highly compliant POSIX implementation, which supports OverlayFS on top of Weka. This lets us combine a local scratch file system with the Weka file system to cache entire directories efficiently. Effectively, we are “hardlinking” folders across file systems to give the compiler exactly the folder layout it expects.
- Given the serial nature of many of the dev tasks managed by Bazel, reducing the completion time of each component task in a build saves significant time overall and substantially improves developer productivity (or at least helps to eliminate an excuse for a lack of productivity, as the case may be).
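The caching pattern described above, where build artifacts are stored once and then read many times, can be illustrated with a minimal sketch. This is a toy in-memory model, not Bazel's remote cache protocol or any Weka API: the class `ArtifactCache`, the function `compile_stub`, and the command string are all hypothetical names chosen for illustration. The request counts are picked only so that the resulting ratio mirrors the 98% figure quoted above; they are not measurements.

```python
import hashlib


class ArtifactCache:
    """Toy in-memory stand-in for a shared artifact cache (hypothetical)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(command, input_blobs):
        # The key depends only on the command and the content of its inputs,
        # so identical work always maps to the same cache entry.
        h = hashlib.sha256()
        h.update(command.encode() + b"\0")
        for blob in input_blobs:
            h.update(hashlib.sha256(blob).digest())
        return h.hexdigest()

    def get_or_build(self, command, input_blobs, build_fn):
        k = self.key(command, input_blobs)
        if k in self._store:
            self.hits += 1          # read path: artifact already cached
            return self._store[k]
        self.misses += 1            # write path: build once, then store
        artifact = build_fn(input_blobs)
        self._store[k] = artifact
        return artifact

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def compile_stub(blobs):
    """Hypothetical 'compiler' that just tags its inputs."""
    return b"obj:" + b"".join(blobs)


cache = ArtifactCache()
sources = [b"int a;", b"int b;"]
# 50 builders each request the same two artifacts (illustrative numbers).
for _ in range(50):
    for src in sources:
        cache.get_or_build("cc -c", [src], compile_stub)

print(cache.hits, cache.misses, cache.hit_ratio())  # 98 2 0.98
```

Only the first request for each distinct input pays the build and write cost. Every later request is a read, which is why storage behind such a cache sees a heavily read-dominated pattern.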
As a developer of build logic, you use a higher-level language called Starlark, a dialect of Python. Starlark abstracts the concepts of a build and hides implementation complexity as much as possible. As a result, you do not have to concern yourself with low-level implementation details like compilers or linkers. Instead, you just point your build to the source code and declare dependencies. Bazel will figure out the rest. Needless to say, you can still fine-tune the compiler or linker settings if needed.
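To make "declare dependencies and let the tool figure out the rest" concrete, here is a small runnable sketch in plain Python. The declarations in the middle are written in the style of a BUILD file, but `cc_library` and `cc_binary` here are simplified stand-in functions defined in the snippet itself, not Bazel's real rules, and the target names (`math`, `planner`, `drive`) are invented. Real Bazel labels also use a `:name` or `//pkg:name` syntax, which is omitted here. The snippet needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Toy registry standing in for a build tool's target graph (hypothetical).
TARGETS = {}


def _declare(kind, name, srcs=(), deps=()):
    TARGETS[name] = {"kind": kind, "srcs": list(srcs), "deps": list(deps)}


def cc_library(name, srcs=(), deps=()):
    """Stand-in for a library rule: only records the declaration."""
    _declare("cc_library", name, srcs, deps)


def cc_binary(name, srcs=(), deps=()):
    """Stand-in for a binary rule: only records the declaration."""
    _declare("cc_binary", name, srcs, deps)


# --- BUILD-file style declarations: what to build, not how ---
cc_library(name = "math", srcs = ["math.cc"])
cc_library(name = "planner", srcs = ["planner.cc"], deps = ["math"])
cc_binary(name = "drive", srcs = ["main.cc"], deps = ["planner"])


def build_order(target):
    """Return the target and its transitive deps in a valid build order."""
    graph = {}
    stack = [target]
    while stack:
        current = stack.pop()
        if current not in graph:
            graph[current] = TARGETS[current]["deps"]
            stack.extend(graph[current])
    # TopologicalSorter yields dependencies before their dependents.
    return list(TopologicalSorter(graph).static_order())


print(build_order("drive"))  # ['math', 'planner', 'drive']
```

The declarations never state an order or a compiler invocation. The order falls out of the dependency graph, which is the essence of the declarative approach.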
When executing builds over and over again, you do not want any surprises. Nondeterministic behavior erodes trust in the correctness of build results. Bazel runs build actions in a sandbox and requires every dependency to be declared explicitly, so an action cannot silently depend on something it did not declare.
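The idea of explicit dependencies can be sketched as follows. This is a conceptual model only: it copies the declared inputs into a fresh temporary directory and runs the action there, whereas Bazel's actual sandboxing relies on operating-system mechanisms and is considerably more involved. The helper `run_sandboxed`, the action `concat_sources`, and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_sandboxed(workspace, declared_inputs, action):
    """Run action(sandbox_dir) in a fresh directory holding only declared inputs."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp)
        for rel_path in declared_inputs:
            dest = sandbox / rel_path
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(Path(workspace) / rel_path, dest)
        return action(sandbox)


def concat_sources(sandbox):
    """Hypothetical action: reads main.cc and util.h from its working directory."""
    return (sandbox / "main.cc").read_text() + (sandbox / "util.h").read_text()


def demo():
    with tempfile.TemporaryDirectory() as ws:
        workspace = Path(ws)
        (workspace / "main.cc").write_text("int main() {}\n")
        (workspace / "util.h").write_text("#pragma once\n")

        # All inputs declared: the action succeeds.
        ok = run_sandboxed(workspace, ["main.cc", "util.h"], concat_sources)

        # util.h is used but not declared: the action fails instead of
        # silently picking the file up from the workspace.
        try:
            run_sandboxed(workspace, ["main.cc"], concat_sources)
            leaked = True
        except FileNotFoundError:
            leaked = False
        return ok, leaked


result, leaked = demo()
print(repr(result), leaked)
```

Because an undeclared input makes the action fail immediately, a missing dependency shows up as a build error on the first run rather than as a build that works on one machine and breaks on another.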
Bazel’s main focus is on projects with large codebases, predominantly for organizations that have decided to put all of their projects into a monorepo. It’s not a dealbreaker if you break down your projects into individual source code repositories. That’s common practice, especially if you are working on software with a microservices architecture. Bazel can handle both code organizational structures quite well.
Improvements to build performance become more apparent in larger codebases, as Bazel can execute its work in parallel and in a distributed fashion. Build execution can be performed on a single machine or distributed across multiple remote machines (e.g., located in a datacenter).
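How a dependency graph enables parallel execution can be shown with a short sketch: any action whose dependencies have finished is eligible to run, so independent actions proceed concurrently. This uses a local thread pool from the Python standard library as a stand-in for build workers. It is not how Bazel schedules or distributes work, and the function `run_parallel` and the target names are hypothetical. It needs Python 3.9 or later.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_parallel(graph, action, max_workers=4):
    """Run action(node) for every node, starting each once its deps are done.

    graph maps each node to the set of nodes it depends on.
    Returns the nodes in the order they finished.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError on cyclic dependencies
    finished = []
    in_flight = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit everything whose dependencies are already complete.
            for node in sorter.get_ready():
                in_flight[pool.submit(action, node)] = node
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                node = in_flight.pop(future)
                future.result()    # re-raise any failure from the action
                finished.append(node)
                sorter.done(node)  # may unblock dependents
    return finished


# "planner" and "perception" both depend only on "math",
# so they can run at the same time once "math" is built.
graph = {
    "math": set(),
    "planner": {"math"},
    "perception": {"math"},
    "drive": {"planner", "perception"},
}

order = run_parallel(graph, lambda name: name)
print(order)
```

In this graph `math` always finishes first and `drive` last, while the relative order of `planner` and `perception` can vary from run to run because they execute concurrently. The wider the graph, the more work can overlap, which is why the gains grow with codebase size.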
Many build tools support building only a single language or ecosystem. That’s not the case with Bazel. Bazel can handle polyglot projects. For example, it supports the JVM (Java Virtual Machine) ecosystem, native languages, and JavaScript. Furthermore, Bazel embraces modern software development methodologies like containerization of applications with Docker and deployment to orchestration engines like Kubernetes.
It’s not uncommon for projects to have custom requirements. While Bazel’s built-in support for languages and ecosystems is broad and expansive, it cannot cover every possible use case. With the help of Bazel’s extension mechanism, called rules, developers can enhance the tool’s base functionality and share it across the organization or wider community.
One of the biggest advantages of using Bazel is that Google is driving it, which means that the project benefits from years of in-house use and evolution at Google. Moreover, since Bazel went open source, it has been backed by a dedicated team of Google developers. As a result, you can expect bug fixes, new features, and long-term support. The latter was confirmed explicitly in the 1.0 release announcement.
Google’s Bazel build tool is emerging as the leading solution for fast and correct large-scale software build automation. While Bazel is suitable for most large or complex build scenarios, it has been adopted with particular enthusiasm in leading-edge areas like machine learning, autonomous vehicles, finance, and highly polyglot systems.