CUDA rules for Bazel

This repository contains a Starlark implementation of CUDA rules for Bazel.

These rules and macros make it easier to build CUDA code with Bazel.

Getting Started

Bzlmod

Add the following to your MODULE.bazel file and replace the placeholders with actual values.

bazel_dep(name = "rules_cc", version = "{rules_cc_version}")
bazel_dep(name = "rules_cuda", version = "0.2.5")

# pick a specific version (this is optional and can be skipped)
archive_override(
    module_name = "rules_cuda",
    integrity = "{SRI value}",  # see https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
    url = "https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz",
    strip_prefix = "rules_cuda-{git_commit_hash}",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
cuda.toolkit(
    name = "cuda",
    toolkit_path = "",
)
use_repo(cuda, "cuda")

rules_cc provides the C++ toolchain dependency for rules_cuda; in Bzlmod, the compatibility repository is handled by rules_cc itself.

Traditional WORKSPACE approach

Add the following to your `WORKSPACE` file and replace the placeholders with actual values.

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_cc",
    sha256 = "{rules_cc_sha256}",
    strip_prefix = "rules_cc-{rules_cc_version}",
    urls = ["https://github.com/bazelbuild/rules_cc/releases/download/{rules_cc_version}/rules_cc-{rules_cc_version}.tar.gz"],
)
load("@rules_cc//cc:extensions.bzl", "compatibility_proxy_repo")
compatibility_proxy_repo()

http_archive(
    name = "rules_cuda",
    sha256 = "{sha256_to_replace}",
    strip_prefix = "rules_cuda-{git_commit_hash}",
    urls = ["https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz"],
)
load("@rules_cuda//cuda:repositories.bzl", "rules_cuda_dependencies", "rules_cuda_toolchains")
rules_cuda_dependencies()
rules_cuda_toolchains(register_toolchains = True)

`rules_cc` needs to be available before loading `rules_cuda`, and `compatibility_proxy_repo()` must be called to populate the compatibility repository that `rules_cc` expects.

**NOTE**: `rules_cuda_toolchains` implicitly calls `register_detected_cuda_toolchains`, and the use of `register_detected_cuda_toolchains` depends on the auto-detection of installed CUDA toolkits.

Toolchain Detection

For hermetic toolchains, the rules handle toolchain configuration and library downloading automatically. See the cuda.redist_json integration test for a comprehensive example.

For locally installed toolchains, _detect_local_cuda_toolkit and detect_clang determine how they are detected.

Either situation depends on cc toolchain availability, so you must also ensure the cc compiler is properly configured. On Windows, this means that you will also need to set the environment variable BAZEL_VC properly.

Rules

cuda_library: Compiles CUDA kernel code and creates a static library from it. The resulting targets can be consumed by C/C++ rules.
cuda_objects: If you don't understand what device linking means, you should never use this rule. It produces incomplete object files that can only be consumed by cuda_library. It exists for source files that need relocatable device code or device link time optimization.
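As a minimal sketch of how the two rules fit together, the BUILD file below compiles a kernel with cuda_library and links it into an ordinary cc_binary, then shows a relocatable device code path through cuda_objects. All file and target names (kernel.cu, main.cpp, a.cu, b.cu, and so on) are hypothetical. The load path @rules_cuda//cuda:defs.bzl and the rdc attribute are assumptions that should be checked against the rules_cuda version you pin.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")
load("@rules_cuda//cuda:defs.bzl", "cuda_library", "cuda_objects")

# Compile CUDA kernel code into a static library.
cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],  # hypothetical source file
    hdrs = ["kernel.h"],   # hypothetical header
)

# A regular C++ target consumes the CUDA library like any cc_library.
cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":kernel"],
)

# Relocatable device code: these targets produce incomplete object files
# that still need a device link step.
cuda_objects(
    name = "b_objects",
    srcs = ["b.cu"],
    hdrs = ["b.cuh"],
)

cuda_objects(
    name = "a_objects",
    srcs = ["a.cu"],
    hdrs = ["a.cuh"],
    deps = [":b_objects"],
)

# Assumption: cuda_library with rdc enabled performs the device link
# over the objects listed in deps.
cuda_library(
    name = "rdc_lib",
    rdc = True,
    deps = [
        ":a_objects",
        ":b_objects",
    ],
)
```

Only the final cuda_library is meant to be handed to C/C++ rules; the cuda_objects targets are intermediate steps.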

Macros

cuda_binary: A convenience macro for building CUDA-enabled executables. It builds a cc_binary-style target from CUDA sources.
cuda_test: A convenience macro for CUDA-enabled tests. It behaves like cuda_binary but creates a cc_test-style target that can be run with bazel test.
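A minimal sketch of the two macros follows. It assumes they are exported from @rules_cuda//cuda:defs.bzl and accept cc_binary-style attributes such as srcs and deps. The target and file names are hypothetical.

```starlark
load("@rules_cuda//cuda:defs.bzl", "cuda_binary", "cuda_test")

# Executable built directly from CUDA sources (hypothetical file name).
cuda_binary(
    name = "vector_add",
    srcs = ["vector_add.cu"],
)

# Test target built the same way; intended to be run with `bazel test`.
cuda_test(
    name = "vector_add_test",
    srcs = ["vector_add_test.cu"],
)
```

Running the test requires a machine where the CUDA runtime and a device are actually available at execution time.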

Flags

Some flags are defined in cuda/BUILD.bazel. For example, to set one on the command line:

bazel build --@rules_cuda//cuda:archs=compute_61:compute_61,sm_61

In your .bazelrc file, you can define a shortcut alias for the flag, for example:

# Convenient flag shortcuts.
build --flag_alias=cuda_archs=@rules_cuda//cuda:archs

and then you can use it as follows:

bazel build --cuda_archs=compute_61:compute_61,sm_61

Available flags

@rules_cuda//cuda:enable

Enable or disable all rules_cuda related rules. When disabled, the detected CUDA toolchains will also be disabled to avoid potential human error. By default, rules_cuda rules are enabled. See examples/if_cuda for how to support both cuda-enabled and cuda-free builds.
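One way to support both kinds of build is to select on the flag's value, sketched below. The config_setting label @rules_cuda//cuda:is_enabled is an assumption about what cuda/BUILD.bazel exposes, and the :kernel target and the CUDA_ENABLED define are hypothetical. The examples/if_cuda directory shows the approach the project itself uses.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")

cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    # Define a macro only when CUDA rules are enabled, so main.cpp can
    # guard its CUDA code paths with #ifdef CUDA_ENABLED.
    defines = select({
        "@rules_cuda//cuda:is_enabled": ["CUDA_ENABLED"],  # assumed label
        "//conditions:default": [],
    }),
    # Depend on the CUDA library only in CUDA-enabled builds.
    deps = select({
        "@rules_cuda//cuda:is_enabled": [":kernel"],  # hypothetical cuda_library
        "//conditions:default": [],
    }),
)
```

With this layout, passing --@rules_cuda//cuda:enable=False should build the same binary without touching any CUDA target.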

@rules_cuda//cuda:archs

Select the CUDA archs to support. See cuda_archs specification DSL grammar.

@rules_cuda//cuda:compiler

Select the CUDA compiler; available options are nvcc or clang.

@rules_cuda//cuda:copts

Add copts to all CUDA compile actions.

@rules_cuda//cuda:host_copts

Add copts to the host compiler.

@rules_cuda//cuda:runtime

Set the default cudart to link; for example, --@rules_cuda//cuda:runtime=@cuda//:cuda_runtime_static links the static CUDA runtime.

--features=cuda_device_debug

Sets nvcc flags to enable debug information in device code. Currently ignored for clang, where --compilation_mode=dbg applies to both host and device code.

Examples

Check out the examples to see if they fit your needs.

See examples for basic usage.

See rules_cuda_examples for extended real-world projects.

Known issue

Sometimes the following error occurs:

cc1plus: fatal error: /tmp/tmpxft_00000002_00000019-2.cpp: No such file or directory

The problem is caused by nvcc using PIDs to determine temporary file names, and with --spawn_strategy linux-sandbox, which is the default strategy on Linux, the PIDs nvcc sees are all very small numbers (say 2~4) due to sandboxing. linux-sandbox is not hermetic because it mounts root into the sandbox, so /tmp is shared between sandboxes, which causes name conflicts under high parallelism. A similar problem has been reported on the NVIDIA forums.

To avoid it:

Update to Bazel 7 or later, where --incompatible_sandbox_hermetic_tmp is enabled by default.
Use --spawn_strategy local, which should eliminate the problem because it lets nvcc see the true PIDs.
Use --experimental_use_hermetic_linux_sandbox, which should eliminate the problem because it avoids sharing /tmp.
Add the -objtemp option to the nvcc command, which should reduce the chance of this happening.
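For the -objtemp workaround, a per-target sketch might look like the following. It assumes the compiler is nvcc and that cuda_library accepts a copts attribute. The target and file names are hypothetical.

```starlark
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
    # -objtemp (--objdir-as-tempdir) asks nvcc to create intermediate files
    # next to the object file rather than in the shared /tmp directory.
    copts = ["-objtemp"],
)
```

The same option could presumably be applied to every CUDA compile action through the @rules_cuda//cuda:copts flag instead of per target.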