CUDA rules for Bazel
This repository contains a Starlark implementation of CUDA rules for Bazel.
It provides a set of rules and macros that make it easier to build CUDA code with Bazel.
Getting Started
Bzlmod
Add the following to your MODULE.bazel file and replace the placeholders with actual values.
bazel_dep(name = "rules_cc", version = "{rules_cc_version}")
bazel_dep(name = "rules_cuda", version = "0.2.5")
# pick a specific version (this is optional and can be skipped)
archive_override(
    module_name = "rules_cuda",
    integrity = "{SRI value}",  # see https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
    url = "https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz",
    strip_prefix = "rules_cuda-{git_commit_hash}",
)
cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
cuda.toolkit(
    name = "cuda",
    toolkit_path = "",
)
use_repo(cuda, "cuda")
rules_cc provides the C++ toolchain dependency for rules_cuda; in Bzlmod, the compatibility repository is handled by rules_cc itself.
Traditional WORKSPACE approach
Add the following to your `WORKSPACE` file and replace the placeholders with actual values.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_cc",
    sha256 = "{rules_cc_sha256}",
    strip_prefix = "rules_cc-{rules_cc_version}",
    urls = ["https://github.com/bazelbuild/rules_cc/releases/download/{rules_cc_version}/rules_cc-{rules_cc_version}.tar.gz"],
)
load("@rules_cc//cc:extensions.bzl", "compatibility_proxy_repo")
compatibility_proxy_repo()
http_archive(
    name = "rules_cuda",
    sha256 = "{sha256_to_replace}",
    strip_prefix = "rules_cuda-{git_commit_hash}",
    urls = ["https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz"],
)
load("@rules_cuda//cuda:repositories.bzl", "rules_cuda_dependencies", "rules_cuda_toolchains")
rules_cuda_dependencies()
rules_cuda_toolchains(register_toolchains = True)
`rules_cc` needs to be available before loading `rules_cuda`, and `compatibility_proxy_repo()` must be called to populate the compatibility repository that `rules_cc` expects.
**NOTE**: `rules_cuda_toolchains` implicitly calls `register_detected_cuda_toolchains`, and the use of
`register_detected_cuda_toolchains` depends on the auto-detection of installed CUDA toolkits.
Toolchain Detection
For hermetic toolchains, the rules handle toolchain configuration and library downloading automatically. See the cuda.redist_json integration test for a comprehensive example.
For locally installed toolchains, _detect_local_cuda_toolkit and detect_clang determine how they are detected.
Either situation depends on cc toolchain availability, so you must also ensure the cc compiler is properly configured.
On Windows, this means that you will also need to set the environment variable BAZEL_VC properly.
Rules
- cuda_library: Compiles CUDA kernel code into a static library. The resulting targets can be consumed by C/C++ rules.
- cuda_objects: If you don't understand what device link means, you must never use it. This rule produces incomplete object files that can only be consumed by cuda_library. It is intended for relocatable device code and device link time optimization source files.
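As a minimal sketch of how cuda_library is consumed by C/C++ rules, the BUILD.bazel fragment below compiles a kernel into a static library and links it into an ordinary cc_binary. The file names (kernel.cu, kernel.h, main.cpp) and target names are placeholders, and the load label `@rules_cuda//cuda:defs.bzl` is assumed to be the public entry point, following the layout of the examples directory.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")

# Assumption: the public rules are exported from //cuda:defs.bzl.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

# Compiles the CUDA sources and packages them as a static library.
cuda_library(
    name = "kernel",  # placeholder target name
    srcs = ["kernel.cu"],  # placeholder source containing the kernels
    hdrs = ["kernel.h"],  # placeholder header declaring the host-side launchers
)

# A plain C++ binary that depends on the CUDA library like on any cc_library.
cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":kernel"],
)
```

With this layout, only kernel.cu goes through the CUDA compiler; main.cpp is built by the regular cc toolchain and just links against the result.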
Macros
- cuda_binary: A convenience macro for building CUDA-enabled executables. It builds a cc_binary-style target from CUDA sources.
- cuda_test: A convenience macro for CUDA-enabled tests. It behaves like cuda_binary but creates a cc_test-style target that can be run with bazel test.
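The fragment below is a hedged sketch of how the two macros might be used side by side. It assumes that both are exported from `@rules_cuda//cuda:defs.bzl` and that they accept cc_binary-style `srcs` and `deps` attributes; check the macro definitions for the exact attribute set. All file and target names are placeholders.

```starlark
# Assumption: both macros are exported from //cuda:defs.bzl.
load("@rules_cuda//cuda:defs.bzl", "cuda_binary", "cuda_library", "cuda_test")

# Shared kernels used by both the executable and the test.
cuda_library(
    name = "vector_add",  # placeholder target name
    srcs = ["vector_add.cu"],
    hdrs = ["vector_add.h"],
)

# Executable built directly from a CUDA source file.
# Assumption: srcs/deps behave as they do for cc_binary.
cuda_binary(
    name = "vector_add_demo",
    srcs = ["demo.cu"],
    deps = [":vector_add"],
)

# Test target with the same shape, intended to be run with `bazel test`.
cuda_test(
    name = "vector_add_test",
    srcs = ["vector_add_test.cu"],
    deps = [":vector_add"],
)
```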
Flags
Some flags are defined in cuda/BUILD.bazel. For example, to set the target architectures on the command line:
bazel build --@rules_cuda//cuda:archs=compute_61:compute_61,sm_61
In .bazelrc file, you can define a shortcut alias for the flag, for example:
# Convenient flag shortcuts.
build --flag_alias=cuda_archs=@rules_cuda//cuda:archs
and then you can use it as follows:
bazel build --cuda_archs=compute_61:compute_61,sm_61
Available flags
@rules_cuda//cuda:enable
Enable or disable all rules_cuda related rules. When disabled, the detected CUDA toolchains will also be disabled to avoid potential human error.
By default, rules_cuda rules are enabled. See examples/if_cuda for how to support both cuda-enabled and cuda-free builds.
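To illustrate what supporting both kinds of build can look like, here is a rough sketch in the spirit of examples/if_cuda. The helpers `if_cuda` and `requires_cuda` and the `@rules_cuda//cuda:is_enabled` config setting are assumed to be provided by rules_cuda under these names; verify them against the example before relying on this. File names, target names, and the `CUDA_ENABLED`/`CUDA_DISABLED` defines are placeholders.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")

# Assumption: if_cuda and requires_cuda are exported next to cuda_library.
load("@rules_cuda//cuda:defs.bzl", "cuda_library", "if_cuda", "requires_cuda")

cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
    # Marks the target incompatible when CUDA rules are disabled,
    # so it is skipped instead of failing the build.
    target_compatible_with = requires_cuda(),
)

cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    # Lets main.cpp pick the CUDA or CPU-only code path at compile time.
    defines = select({
        "@rules_cuda//cuda:is_enabled": ["CUDA_ENABLED"],
        "//conditions:default": ["CUDA_DISABLED"],
    }),
    # Depend on the CUDA library only when CUDA is enabled.
    deps = if_cuda([":kernel"]),
)
```

Under these assumptions, passing `--@rules_cuda//cuda:enable=False` would build `main` without the kernel, while the default build links it in.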
@rules_cuda//cuda:archs
Select the CUDA archs to support. See cuda_archs specification DSL grammar.
@rules_cuda//cuda:compiler
Select the CUDA compiler; available options are nvcc or clang.
@rules_cuda//cuda:copts
Add copts to all CUDA compile actions.
@rules_cuda//cuda:host_copts
Add copts to the host compiler.
@rules_cuda//cuda:runtime
Set the default cudart to link; for example, --@rules_cuda//cuda:runtime=@cuda//:cuda_runtime_static links the static CUDA runtime.
--features=cuda_device_debug
Sets nvcc flags to enable debug information in device code.
Currently ignored for clang, where --compilation_mode=debug applies to both
host and device code.
Examples
Check out the examples to see if they fit your needs.
See examples for basic usage.
See rules_cuda_examples for extended real-world projects.
Known issue
Sometimes the following error occurs:
cc1plus: fatal error: /tmp/tmpxft_00000002_00000019-2.cpp: No such file or directory
The problem is caused by nvcc using PIDs to determine temporary file names, and with --spawn_strategy linux-sandbox, which is the default strategy on Linux, the PIDs nvcc sees are all very small numbers (say 2~4) due to sandboxing. linux-sandbox is not hermetic because it mounts root into the sandbox, so /tmp is shared between sandboxes, which causes name conflicts under high parallelism. A similar problem has been reported on the NVIDIA forums.
To avoid it:
- Update to Bazel 7, where --incompatible_sandbox_hermetic_tmp is enabled by default.
- Using --spawn_strategy local should eliminate the issue because it lets nvcc see the true PIDs.
- Using --experimental_use_hermetic_linux_sandbox should eliminate the issue because it avoids sharing /tmp.
- Adding the -objtemp option should reduce the chance of this happening.
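For the last option, one possible way to pass the flag is per target through the `copts` attribute, as in the sketch below. It assumes that `copts` on cuda_library is forwarded to nvcc unchanged and that nvcc is the selected compiler; the target and file names are placeholders.

```starlark
# Assumption: the public rules are exported from //cuda:defs.bzl.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",  # placeholder target name
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
    # Assumption: forwarded to nvcc as-is. The flag is nvcc-specific and is
    # used here to make temporary file name clashes in /tmp less likely.
    copts = ["-objtemp"],
)
```

To apply the option to every CUDA compile action instead of a single target, the `@rules_cuda//cuda:copts` flag described above can carry the same value.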