Notes on cross-compiling Rust

One of my current hobby projects involves running Rust binaries on a Raspberry Pi. There are three computers involved: the Pi itself (ARMv7 Linux), my desktop (x86-64 Linux), and sometimes my laptop (x86-64 macOS).

The release of Cyberpunk 2077 means that my desktop will be spending more time booted into Windows, so I needed to figure out how to get the macOS machine to build binaries for ARMv7 Linux. I had hoped this would be straightforward because rustc is a native cross-compiler, and I've had good experiences with cross-compiling other modern languages (e.g. Go).

Unfortunately, when I did a web search for [cross-compiling rust], the results were universally terrible[1]. This page contains my notes on how to get cross-compilation working with either Cargo or Bazel, plus some suggestions for the rustup and rules_rust projects that could make cross-compilation simpler in the future.

Background

In the early days of software engineering, when high-level languages like C were just starting to displace assembly, compilers used build-time configuration to select a target platform. This meant that any given build of the compiler could only generate object code for a single platform. The concept of cross-compilation was introduced to describe compilers that could be built to run on Platform A but generate object code for Platform B.

Times change, and nowadays every major compiler[2] is what's called a "native cross-compiler", allowing the target platform to be selected at runtime (e.g. with a CLI flag). This includes the Rust compiler rustc, which as of v1.48 supports well over a hundred distinct targets.

rustc --version
# rustc 1.48.0 (7eac88abb 2020-11-16)
rustc --print target-list | wc -l
# 156
rustc --print target-list | sort -R | head -n 10 | sort
# aarch64-apple-darwin
# i686-uwp-windows-msvc
# msp430-none-elf
# powerpc-unknown-linux-gnuspe
# powerpc-wrs-vxworks
# sparc64-unknown-linux-gnu
# sparc64-unknown-openbsd
# thumbv4t-none-eabi
# thumbv7a-pc-windows-msvc
# x86_64-pc-windows-msvc
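
Most of these names follow a loose arch-vendor-os-environment pattern, with some fields omitted in the shorter ones. As a rough sketch of how the long form breaks down, here is a small splitter; the `Triple` struct and `parse_triple` function are hypothetical names made up for this example, and it deliberately refuses anything that doesn't have exactly four fields, because shorter names can't be split reliably by position alone.

```rust
/// The four fields of a fully spelled-out target name.
/// (Hypothetical helper type, not part of rustc or Cargo.)
#[derive(Debug, PartialEq)]
struct Triple<'a> {
    arch: &'a str,
    vendor: &'a str,
    os: &'a str,
    env: &'a str,
}

/// Splits a four-field name such as `armv7-unknown-linux-gnueabihf`.
/// Returns `None` for names with any other number of fields
/// (e.g. `msp430-none-elf`), which omit a field.
fn parse_triple(s: &str) -> Option<Triple<'_>> {
    let parts: Vec<&str> = s.split('-').collect();
    match parts[..] {
        [arch, vendor, os, env] => Some(Triple { arch, vendor, os, env }),
        _ => None,
    }
}

fn main() {
    let names = [
        "armv7-unknown-linux-gnueabihf",
        "armv7-unknown-linux-musleabihf",
        "msp430-none-elf",
    ];
    for name in names.iter() {
        println!("{}: {:?}", name, parse_triple(name));
    }
}
```

The two ARMv7 Linux targets used later in these notes differ only in the last field, which is where the choice of libc shows up.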

In practice cross-compilation requires more than simply generating object code, but with a bit of effort from the toolchain developers it's possible to make this nearly seamless. Go is the gold standard here; it ships its own linker and the sources for its standard library, so a normal installation can directly build executables for any supported target.

Rustup and Cargo

The first build tool I tried was Cargo, which I installed with rustup. I dislike building with Cargo because it's primitive and inflexible, but since it's the official Rust build tool I hoped it would be the best documented.

# Cargo.toml

[package]
name = "helloworld"
version = "0.0.1"
edition = "2018"

[[bin]]
name = "helloworld"
path = "helloworld.rs"
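
The contents of helloworld.rs aren't shown in these notes, so the following is only a guess at a minimal version. The `greeting` helper and its exact text are assumptions; the one deliberate addition is printing `std::env::consts::ARCH` and `OS`, which are compile-time constants describing the target, so the binary itself can confirm what it was built for.

```rust
// helloworld.rs (hypothetical contents; the original file is not shown)
use std::env::consts::{ARCH, OS};

/// Builds the greeting; kept separate from `main` so it is easy to test.
fn greeting(arch: &str, os: &str) -> String {
    format!("Hello, world! (built for {}-{})", arch, os)
}

fn main() {
    // ARCH and OS describe the *target* the binary was compiled for,
    // not the machine that ran the compiler.
    println!("{}", greeting(ARCH, OS));
}
```

Built natively on the macOS laptop this should report x86_64 and macos; a build for the Pi should report arm and linux.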

Cargo uses the --target flag to enable cross-compilation.

cargo build --target armv7-unknown-linux-gnueabihf
#    Compiling helloworld v0.0.1 (/Users/john/src/rust-cross-compilation)
# error[E0463]: can't find crate for `std`
#   |
#   = note: the `armv7-unknown-linux-gnueabihf` target may not be installed

Whereas Go will build its standard library from source when cross-compiling, Rust relies on precompiled libraries[3]. We can use rustup to fetch a prebuilt std for Linux on ARMv7.

rustup target add armv7-unknown-linux-gnueabihf
# info: downloading component 'rust-std' for 'armv7-unknown-linux-gnueabihf'
# info: installing component 'rust-std' for 'armv7-unknown-linux-gnueabihf'
# info: using up to 500.0 MiB of RAM to unpack components
#  18.2 MiB /  18.2 MiB (100 %)  11.5 MiB/s in  1s ETA:  0s

cargo build --target armv7-unknown-linux-gnueabihf
#    Compiling helloworld v0.0.1 (/Users/john/src/rust-cross-compilation)
# error: linking with `cc` failed: exit code: 1
#   |
#   = note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-Wl,--eh-frame-hdr" "-L"
#   [...]
#   "-Wl,-Bdynamic" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-ldl" "-lutil"
#   = note: clang: warning: argument unused during compilation: '-pie' [-Wunused-command-line-argument]
#           ld: unknown option: --as-needed
#           clang: error: linker command failed with exit code 1 (use -v to see invocation)

The source file was successfully compiled, but it couldn't be linked into an executable. It looks like Cargo is trying to use the host system's linker, which will sometimes work, but fails in this particular case because the macOS linker only supports Apple targets.

Luckily the LLVM project, in addition to the compilation framework, also distributes the cross-platform LLD linker. While it doesn't cover every platform supported by rustc, it does support the common ones. We can configure Cargo to use it for linking our ARMv7 Linux binary.

I downloaded clang+llvm-11.0.0-x86_64-apple-darwin.tar.xz from https://releases.llvm.org/download.html and extracted it to ~/.opt/, then added a .cargo/config.toml to my workspace.

# .cargo/config.toml
[build]

[target.armv7-unknown-linux-gnueabihf]
linker = "/Users/john/.opt/clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld"

cargo build --target armv7-unknown-linux-gnueabihf
#    Compiling helloworld v0.0.1 (/Users/john/src/rust-cross-compilation)
# error: linking with `/Users/john/.opt/clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld` failed: exit code: 1
#   |
#   = note: "/Users/john/.opt/clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld" "-flavor" "gnu" "--eh-frame-hdr" "-L"
#   [...]
#    "-Bdynamic" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-ldl" "-lutil"
#   = note: lld: error: unable to find library -lgcc_s
#           lld: error: unable to find library -lc
#           lld: error: unable to find library -lm
#           lld: error: unable to find library -lrt
#           lld: error: unable to find library -lpthread
#           lld: error: unable to find library -lutil
#           lld: error: unable to find library -ldl
#           lld: error: unable to find library -lutil

Getting closer!

The linker is being told to build an executable that dynamically links against the GNU libc, which I don't have a copy of. One option here is to download it from (for example) the Ubuntu package hosting, but I don't want to do that because I don't think a Rust binary should be depending on libc at all. Rust ought to be considered a replacement for C, rather than a thin layer on top.

Therefore I'm going to switch the Cargo target to the MUSL variant, which treats libc as an implementation detail rather than a core component of the platform.

rustup target add armv7-unknown-linux-musleabihf
# info: downloading component 'rust-std' for 'armv7-unknown-linux-musleabihf'
# info: installing component 'rust-std' for 'armv7-unknown-linux-musleabihf'
# info: using up to 500.0 MiB of RAM to unpack components
#  15.8 MiB /  15.8 MiB (100 %)  12.1 MiB/s in  1s ETA:  0s

# .cargo/config.toml
[build]

[target.armv7-unknown-linux-musleabihf]
linker = "/Users/john/.opt/clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld"

cargo build --target armv7-unknown-linux-musleabihf
#    Compiling helloworld v0.0.1 (/Users/john/src/rust-cross-compilation)
#     Finished dev [unoptimized + debuginfo] target(s) in 1.50s

Success! The resulting binary is a valid executable for ARMv7 Linux, and can be run as-is on the Raspberry Pi.

file target/armv7-unknown-linux-musleabihf/debug/helloworld
# target/armv7-unknown-linux-musleabihf/debug/helloworld: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
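
If `file` isn't handy, the same basic facts can be read straight out of the ELF header. The sketch below checks the magic number, the class byte (32 vs 64 bit), the byte order, and the `e_machine` field, where 40 is the standard value for 32-bit ARM. The `ElfSummary` struct and `parse_elf_header` function are names invented for this example, and it looks only at the first 20 bytes, so it says nothing about static vs dynamic linking.

```rust
use std::{env, fs};

/// e_machine value for 32-bit ARM in the ELF specification.
const EM_ARM: u16 = 40;

/// The handful of header fields this sketch extracts.
#[derive(Debug, PartialEq)]
struct ElfSummary {
    bits: u8,            // 32 or 64, from e_ident[EI_CLASS]
    little_endian: bool, // from e_ident[EI_DATA]
    machine: u16,        // e_machine, stored at byte offset 18
}

/// Parses the start of an ELF file. Returns `None` if the input is too
/// short, lacks the ELF magic, or has an unknown class or byte order.
fn parse_elf_header(bytes: &[u8]) -> Option<ElfSummary> {
    if bytes.len() < 20 || !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    let bits = match bytes[4] {
        1 => 32,
        2 => 64,
        _ => return None,
    };
    let little_endian = match bytes[5] {
        1 => true,
        2 => false,
        _ => return None,
    };
    let raw = [bytes[18], bytes[19]];
    let machine = if little_endian {
        u16::from_le_bytes(raw)
    } else {
        u16::from_be_bytes(raw)
    };
    Some(ElfSummary { bits, little_endian, machine })
}

fn main() {
    // The path comes from the command line, e.g. the binary Cargo produced.
    let path = match env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: elfcheck <path-to-binary>");
            return;
        }
    };
    let bytes = match fs::read(&path) {
        Ok(b) => b,
        Err(e) => {
            eprintln!("{}: {}", path, e);
            return;
        }
    };
    match parse_elf_header(&bytes) {
        Some(h) => println!("{}: {:?} (ARM: {})", path, h, h.machine == EM_ARM),
        None => println!("{}: not an ELF file", path),
    }
}
```

For the binary above, the fields to expect are 32 bits, little-endian, and machine 40, matching the "ELF 32-bit LSB executable, ARM" part of the `file` output.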

Bazel

Bazel is a language-agnostic build system. Its configuration language deals in actions and dependency graphs, rather than executables and libraries, which gives it some interesting scaling properties:

  • Building single-language projects with Bazel can be more difficult than using language-specific tools.
  • Building multi-language projects is substantially easier in Bazel than in any other build system.

This makes Bazel a natural choice of build tool for any system that involves (1) FFI, (2) generated code, or (3) well-factored subsystems. It is uniquely capable when compared to Cargo because it can build multiple Rust libraries ("crates") within a single workspace.

The first step to build Rust with Bazel is to configure the WORKSPACE to depend on rules_rust. This will also define the default Rust version and edition. There's no need to install toolchains or targets, because Bazel will fetch them on demand.

# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "io_bazel_rules_rust",
    # HEAD commit as of 2020-12-05
    urls = ["https://github.com/bazelbuild/rules_rust/archive/67f0c5ec0397d24ccc14264a0eda86915ddf63e8.tar.gz"],
    sha256 = "c587d402e4502100b01e4ba7d9584809cf4f4eb2d2f6634097883637bfb512b1",
    strip_prefix = "rules_rust-67f0c5ec0397d24ccc14264a0eda86915ddf63e8",
)

load("@io_bazel_rules_rust//rust:repositories.bzl", "rust_repositories")

rust_repositories(
    edition = "2018",
    version = "1.48.0",
)

Next we need to create a top-level BUILD file. This will define a rust_binary target for our hello-world executable, and also a platform describing what sort of system we want to build for.

# BUILD.bazel
load("@io_bazel_rules_rust//rust:rust.bzl", "rust_binary")

rust_binary(
    name = "helloworld",
    srcs = ["helloworld.rs"],
)

platform(
    name = "linux-armv7",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:arm",
    ],
)

In the future the platform should be able to use a more specific "cpu:armv7" constraint (bazelbuild/rules_rust#509) and to constrain on the Rust release channel (bazelbuild/rules_rust#510).

Anyway, that should be enough, but if we try running it we'll hit an error about missing toolchains.

bazel build //:helloworld --platforms=//:linux-armv7
# [...]
# ERROR: While resolving toolchains for target //:helloworld: no matching toolchains found for types @io_bazel_rules_rust//rust:toolchain

This is because rules_rust doesn't pre-register toolchains for all supported target platforms – it makes the user register each (host, target) mapping explicitly. We need to tell rules_rust to register a toolchain that can run on macOS (Darwin) and build for ARMv7 Linux.

# WORKSPACE
load("@io_bazel_rules_rust//rust:repositories.bzl", "rust_repository_set")

rust_repository_set(
    name = "rust_linux_armv7",
    edition = "2018",
    exec_triple = "x86_64-apple-darwin",
    extra_target_triples = ["armv7-unknown-linux-musleabihf"],
    rustfmt_version = "1.4.20",
    version = "1.48.0",
)

bazel build //:helloworld --platforms=//:linux-armv7
# [...]
# INFO: From Compiling Rust bin helloworld (1 files):
# error: linking with `external/local_config_cc/cc_wrapper.sh` failed: exit code: 1
#   |
#   = note: "external/local_config_cc/cc_wrapper.sh" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-Wl,--eh-frame-hdr" "-nostartfiles"
#   = note: clang: warning: argument unused during compilation: '-no-pie' [-Wunused-command-line-argument]
#           ld: unknown option: --as-needed
#           clang: error: linker command failed with exit code 1 (use -v to see invocation)

This is the same linker error as we saw with Cargo, and the solution is to tell rules_rust that it should use LLD. However, there's a problem – rules_rust doesn't have its own linker toolchain; it uses the C/C++ toolchain to find a linker.

We must now contend with the Bazel C/C++ configuration system, which is designed to handle the world's wide range of strange C compilers. I'm not going to give a blow-by-blow here because none of it is relevant to Rust, but a summary is:

  • We create a new Bazel package //cc-toolchain that will contain the C/C++ configuration. I'm just going to pull in the linker from the filesystem rather than properly repository_rule it, so the toolchain file sets will be empty stubs.
  • The CcToolchainConfigInfo itself requires the path to a bunch of different tools; since the only one needed here is lld I'll hardcode most of the rest to /bin/false.
  • This project doesn't need to build any C/C++ code for the host (e.g. for codegen), so I'm going to override --host_crosstool_top rather than define a true host-compatible toolchain.

A more complete solution would probably involve the Clang-based toolchains defined in https://github.com/bazelbuild/bazel-toolchains.

# cc-toolchain/BUILD

load(":config.bzl", "cc_toolchain_config")

filegroup(name = "empty")

cc_toolchain_suite(
    name = "clang_suite",
    toolchains = {
        "armv7": ":armv7_toolchain",
    },
)

cc_toolchain(
    name = "armv7_toolchain",
    all_files = ":empty",
    compiler_files = ":empty",
    dwp_files = ":empty",
    linker_files = ":empty",
    objcopy_files = ":empty",
    strip_files = ":empty",
    supports_param_files = False,
    toolchain_config = ":armv7_toolchain_config",
    toolchain_identifier = "armv7-toolchain",
)

cc_toolchain_config(name = "armv7_toolchain_config")

# cc-toolchain/config.bzl

load(
    "@bazel_tools//tools/cpp:cc_toolchain_config_lib.bzl",
    "action_config",
    "tool",
    "tool_path",
)
load(
    "@bazel_tools//tools/build_defs/cc:action_names.bzl",
    "CPP_LINK_EXECUTABLE_ACTION_NAME",
)

LLD = "/Users/john/.opt/clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld"

def _cc_toolchain_config_impl(ctx):
    return cc_common.create_cc_toolchain_config_info(
        ctx = ctx,
        toolchain_identifier = "armv7-toolchain",
        host_system_name = "local",
        target_system_name = "armv7-unknown-linux-musleabihf",
        target_cpu = "armv7",
        target_libc = "unknown",
        compiler = "clang",
        abi_version = "unknown",
        abi_libc_version = "unknown",
        action_configs = [
            action_config(
                action_name = CPP_LINK_EXECUTABLE_ACTION_NAME,
                enabled = True,
                tools = [tool(path = LLD)],
            ),
        ],
        tool_paths = [
            tool_path(
                name = "ld",
                path = LLD,
            ),
            tool_path(
                name = "ar",
                path = "/usr/bin/ar",
            ),
            tool_path(
                name = "cpp",
                path = "/bin/false",
            ),
            tool_path(
                name = "gcc",
                path = "/usr/bin/clang",
            ),
            tool_path(
                name = "gcov",
                path = "/bin/false",
            ),
            tool_path(
                name = "nm",
                path = "/bin/false",
            ),
            tool_path(
                name = "objdump",
                path = "/bin/false",
            ),
            tool_path(
                name = "strip",
                path = "/bin/false",
            ),
        ],
    )

cc_toolchain_config = rule(
    implementation = _cc_toolchain_config_impl,
    attrs = {},
    provides = [CcToolchainConfigInfo],
)

Whew. With that mess dealt with, rules_rust will now link with LLD and produce valid ARMv7 Linux binaries.

bazel build //:helloworld --platforms=//:linux-armv7 \
  --cpu=armv7 \
  --crosstool_top=//cc-toolchain:clang_suite \
  --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
# INFO: Invocation ID: f6c497d9-48db-4240-85b5-c8bfa675c49b
# INFO: Analyzed target //:helloworld (10 packages loaded, 274 targets configured).
# INFO: Found 1 target...
# Target //:helloworld up-to-date:
# 	bazel-bin/helloworld
# INFO: Elapsed time: 33.660s, Critical Path: 0.45s
# INFO: 10 processes: 5 remote cache hit, 5 internal.
# INFO: Build completed successfully, 10 total actions

file bazel-bin/helloworld
# bazel-bin/helloworld: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped

Suggestions

  1. rules_rust has some work to do on making its toolchains ergonomic. Right now they couple the host binaries and target libraries into a single ToolchainInfo, which means Bazel can't resolve them separately based on host and target constraints. If they were split up (bazelbuild/rules_rust#523) then the entire set of supported targets could be pre-registered by a rust_toolchains() macro.
  2. rules_rust should decouple its linker command from the C/C++ toolchain. I shouldn't have to touch anything related to cc to get a working rustc + lld combo.
  3. Both rustup and rules_rust should integrate support for LLD. While I'm not sure if it should be the default for all platforms, it should definitely be the default (or strongly recommended) for cross-compilation.
  4. The LLVM project should offer some non-monolithic downloads for some tools, or alternatively the Rust project should host a stripped-down archive for LLD. The full LLVM binary distribution is huge and it doesn't make sense to make users download a complete copy of Clang just so they can link ELF binaries on macOS.

    du -sh clang+llvm-11.0.0-x86_64-apple-darwin/
    # 2.4G	clang+llvm-11.0.0-x86_64-apple-darwin/
    du -sh clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld
    #  81M	clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld

    It doesn't even use any of the bundled dylibs!

    otool -L clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld
    # clang+llvm-11.0.0-x86_64-apple-darwin/bin/lld:
    # 	/usr/lib/libxml2.2.dylib (compatibility version 10.0.0, current version 10.9.0)
    # 	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
    # 	/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
    # 	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
    # 	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 902.1.0)
  5. Cross-compilation should be covered by official Rust documentation. The Rust Book's maintainers have declined to add a chapter about it (rust-lang/book#2367), which makes me sad, but I am hopeful that it might one day be covered in the Embedded Rust book.

  [1] If a tutorial on cross-compiling Rust starts off with installing Docker or Vagrant then I'm not fucking reading it. And stop linking me to rust-embedded/cross, hiding these insane dependency stacks behind a "magical" wrapper doesn't help anybody worth helping.

  [2] Except for GCC, which like most GNU software chooses to remain frozen in a grotesque parody of mid-80s UNIX.

  [3] I've heard this is due to the Rust standard library's dependency on libc, thus requiring a C toolchain and headers to build std for a given platform.