Configuring CROSSTOOL
Overview
This tutorial uses an example scenario to describe how to configure CROSSTOOL for a project. It’s based on an example C++ project that builds error-free using gcc, clang, and msvc.
In this tutorial, you configure a CROSSTOOL file so that Bazel can build the application with emscripten. The expected outcome is to run bazel build --config=asmjs test/helloworld.js on a Linux machine and build the C++ application using emscripten targeting asm.js.
Setting up the build environment
This tutorial assumes you are on a Linux machine on which you have successfully built C++ applications. In other words, we assume that the appropriate tooling and libraries are installed.
Set up your build environment as follows:
-
If you have not already done so, download and install Bazel 0.19 or later.
-
Download the example C++ project from GitHub and place it in an empty directory on your local machine.
-
Add the following cc_binary target to the main/BUILD file:

    cc_binary(
        name = "helloworld.js",
        srcs = ["helloworld.cc"],
    )
-
Create a .bazelrc file at the root of the workspace directory with the following contents to enable the use of the --config flag:

    # Create a new CROSSTOOL file for our toolchain.
    build:asmjs --crosstool_top=//toolchain:emscripten

    # Use --cpu as a differentiator.
    build:asmjs --cpu=asmjs

    # Specify a "sane" C++ toolchain for the host platform.
    build:asmjs --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
In this example, we use the --cpu flag as a differentiator, since emscripten can target both asm.js and WebAssembly. We do not configure a WebAssembly toolchain, however. Because Bazel uses many internal tools written in C++, such as process-wrapper, we also specify a “sane” C++ toolchain for the host platform.
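To make the effect of the .bazelrc entries more concrete, the sketch below prints the long-form command that --config=asmjs stands for. It is only an illustration: it writes a copy of the three lines into a scratch directory and reads them with awk, assuming one flag per line, which is far simpler than how Bazel itself parses .bazelrc files.

```shell
#!/bin/bash
set -euo pipefail

# Scratch directory so the sketch does not touch a real workspace.
work="$(mktemp -d)"

# The same three entries as in the .bazelrc above (comments omitted).
cat > "$work/.bazelrc" <<'EOF'
build:asmjs --crosstool_top=//toolchain:emscripten
build:asmjs --cpu=asmjs
build:asmjs --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
EOF

# Collect the flag on each "build:asmjs" line (assumes one flag per line).
flags="$(awk '$1 == "build:asmjs" { print $2 }' "$work/.bazelrc" | tr '\n' ' ')"

# Print the command that "bazel build --config=asmjs helloworld.js" is shorthand for.
echo "bazel build ${flags}helloworld.js"
```

Grouping the flags under a named config keeps the command line short and lets other configs be added later without changing how the default build behaves.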
Configuring the C++ toolchain
To configure the C++ toolchain, repeatedly build the application and eliminate each error one by one as described below.
Note: This tutorial assumes you’re using Bazel 0.19 or later. If you’re using an older release of Bazel, the build errors listed may appear in a different order, but the configuration procedure is the same.
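The build-and-fix loop described above can be made a little more convenient with a small helper that shows only the first error of a run. The following is a sketch, not part of the original tutorial: first_error and fake_build are hypothetical names, and the fake build output is invented so the helper can be tried without Bazel.

```shell
#!/bin/bash
set -euo pipefail

# Run the given command and print only the first output line that mentions
# an error. The '|| true' parts keep the script alive when the build fails
# or when no error line is found.
first_error() {
  { "$@" 2>&1 || true; } | grep -m1 -i 'error' || true
}

# Hypothetical stand-in for a failing build. In a real workspace you would
# call: first_error bazel build --config=asmjs helloworld.js
fake_build() {
  echo "INFO: hypothetical progress message"
  echo "ERROR: first hypothetical problem" >&2
  echo "ERROR: second hypothetical problem" >&2
  return 1
}

found="$(first_error fake_build)"
echo "$found"
```

Looking at one error at a time matches the procedure in the steps that follow, where each fix typically uncovers the next error.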
-
Run the build with the following command:

    bazel build --config=asmjs helloworld.js

Because you specified --crosstool_top=//toolchain:emscripten in the .bazelrc file, Bazel throws the following error:

    No such package `toolchain`: BUILD file not found on package path.

In the workspace directory, create the toolchain directory for the package and an empty BUILD file inside the toolchain directory.
-
Run the build again. Because the toolchain package does not yet define the emscripten target, Bazel throws the following error:

    No such target '//toolchain:emscripten': target 'emscripten' not declared in package 'toolchain' defined by .../toolchain/BUILD

In the toolchain/BUILD file, define an empty filegroup as follows:

    package(default_visibility = ['//visibility:public'])

    filegroup(name = "emscripten")
-
Run the build again. Bazel throws the following error:

    The specified --crosstool_top '//toolchain:emscripten' is not a valid cc_toolchain_suite rule.

Bazel discovered that the --crosstool_top flag does not point to a cc_toolchain_suite rule. In the toolchain/BUILD file, replace the empty filegroup with the following:

    cc_toolchain_suite(
        name = "emscripten",
        toolchains = {
            "asmjs": ":asmjs_toolchain",
            "asmjs|emscripten": ":asmjs_toolchain",
        },
    )

The toolchains attribute automatically maps the --cpu (and also --compiler when specified) values to cc_toolchain. You have not yet defined any cc_toolchain targets and Bazel will complain about that shortly.
-
Run the build again. Bazel throws the following error:

    The crosstool_top you specified was resolved to '//toolchain:emscripten', which does not contain a CROSSTOOL file.

Bazel expects a CROSSTOOL file in the toolchain:emscripten package. Create an empty CROSSTOOL file inside the toolchain directory.
-
Run the build again. Bazel throws the following error:

    Could not read the crosstool configuration file 'CROSSTOOL file .../toolchain/CROSSTOOL', because of an incomplete protocol buffer (Message missing required fields: major_version, minor_version, default_target_cpu)

Bazel read the CROSSTOOL file and found nothing inside. Populate the CROSSTOOL file as follows:

    major_version: "1"
    minor_version: "0"
    default_target_cpu: "asmjs"
-
Run the build again. Bazel throws the following error:

    The label '//toolchain:asmjs_toolchain' is not a cc_toolchain rule.

This is an important milestone in which you define cc_toolchain targets for every toolchain in the CROSSTOOL file. This is where you specify the files that comprise the toolchain so that Bazel can set up sandboxing. Add the following to the toolchain/BUILD file:

    filegroup(name = "empty")

    cc_toolchain(
        name = "asmjs_toolchain",
        toolchain_identifier = "asmjs-toolchain",
        all_files = ":empty",
        compiler_files = ":empty",
        cpu = "asmjs",
        dwp_files = ":empty",
        dynamic_runtime_libs = [":empty"],
        linker_files = ":empty",
        objcopy_files = ":empty",
        static_runtime_libs = [":empty"],
        strip_files = ":empty",
        supports_param_files = 0,
    )
-
Run the build again. Bazel throws the following error:

    No toolchain found for cpu 'asmjs'.

Since you have specified --crosstool_top and --cpu in the .bazelrc file, //toolchain:asmjs_toolchain is selected. Because we specify toolchain_identifier = "asmjs-toolchain", we need to create a toolchain definition with this identifier. Add the following to the CROSSTOOL file:

    toolchain {
      toolchain_identifier: "asmjs-toolchain"
      host_system_name: "i686-unknown-linux-gnu"
      target_system_name: "asmjs-unknown-emscripten"
      target_cpu: "asmjs"
      target_libc: "unknown"
      compiler: "emscripten"
      abi_version: "unknown"
      abi_libc_version: "unknown"
    }

The above definition also specifies the compiler, which you can use to more precisely select the C++ toolchain.
Because we want to omit the --compiler flag and only use the --cpu flag, we have added an asmjs key to cc_toolchain_suite.toolchains.
-
Run the build again. Bazel throws the following error:

    .../BUILD:1:1: C++ compilation of rule '//:helloworld.js' failed (Exit 1)
    src/main/tools/linux-sandbox-pid1.cc:421: "execvp(toolchain/DUMMY_GCC_TOOL, 0x11f20e0)": No such file or directory
    Target //:helloworld.js failed to build

At this point, Bazel has enough information to attempt building the code but it still does not know what tools to use to complete the required build actions. Add the following to your CROSSTOOL file to tell Bazel what tools to use:

    # toolchain/CROSSTOOL
    # ...
    tool_path { name: "gcc" path: "emcc.sh" }
    tool_path { name: "ld" path: "emcc.sh" }
    tool_path { name: "ar" path: "/bin/false" }
    tool_path { name: "cpp" path: "/bin/false" }
    tool_path { name: "gcov" path: "/bin/false" }
    tool_path { name: "nm" path: "/bin/false" }
    tool_path { name: "objdump" path: "/bin/false" }
    tool_path { name: "strip" path: "/bin/false" }

You may notice the emcc.sh wrapper script, which delegates to the external emcc.py file. Create the script in the toolchain package directory with the following contents and set its executable bit:

    #!/bin/bash
    set -euo pipefail
    python external/emscripten_toolchain/emcc.py "$@"

Paths specified in the CROSSTOOL file are relative to the location of the CROSSTOOL file itself.

The emcc.py file does not yet exist in the workspace directory. To obtain it, you can either check the emscripten toolchain in with your project or pull it from its GitHub repository. This tutorial uses the latter approach. To pull the toolchain from the GitHub repository, add the following new_http_archive repository definitions to your WORKSPACE file:

    new_http_archive(
        name = 'emscripten_toolchain',
        url = 'https://github.com/kripken/emscripten/archive/1.37.22.tar.gz',
        build_file = 'emscripten-toolchain.BUILD',
        strip_prefix = "emscripten-1.37.22",
    )

    new_http_archive(
        name = 'emscripten_clang',
        url = 'https://s3.amazonaws.com/mozilla-games/emscripten/packages/llvm/tag/linux_64bit/emscripten-llvm-e1.37.22.tar.gz',
        build_file = 'emscripten-clang.BUILD',
        strip_prefix = "emscripten-llvm-e1.37.22",
    )

In the workspace directory root, create the emscripten-toolchain.BUILD and emscripten-clang.BUILD files that expose these repositories as filegroups and establish their visibility across the build.

First create the emscripten-toolchain.BUILD file with the following contents:

    package(default_visibility = ['//visibility:public'])

    filegroup(
        name = "all",
        srcs = glob(["**/*"]),
    )

Next, create the emscripten-clang.BUILD file with the following contents:

    package(default_visibility = ['//visibility:public'])

    filegroup(
        name = "all",
        srcs = glob(["**/*"]),
    )

You may notice that the targets simply include all of the files contained in the archives pulled by the new_http_archive repository rules. In a real world scenario, you would likely want to be more selective and granular by only including the files needed by the build and splitting them by action, such as compilation, linking, and so on. For the sake of simplicity, this tutorial omits this step.
-
Run the build again. Bazel throws the following error:

    "execvp(toolchain/emcc.sh, 0x12bd0e0)": No such file or directory

You now need to make Bazel aware of the artifacts you added in the previous step. In particular, the emcc.sh script must also be explicitly listed as a dependency of the corresponding cc_toolchain rule. Modify the toolchain/BUILD file to look as follows:

    package(default_visibility = ['//visibility:public'])

    cc_toolchain_suite(
        name = "emscripten",
        toolchains = {
            "asmjs": ":asmjs_toolchain",
            "asmjs|emscripten": ":asmjs_toolchain",
        },
    )

    filegroup(name = "empty")

    filegroup(
        name = "all",
        srcs = [
            "emcc.sh",
            "@emscripten_toolchain//:all",
            "@emscripten_clang//:all"
        ],
    )

    cc_toolchain(
        name = "asmjs_toolchain",
        toolchain_identifier = "asmjs-toolchain",
        all_files = ":all",
        compiler_files = ":all",
        cpu = "asmjs",
        dwp_files = ":empty",
        dynamic_runtime_libs = [":empty"],
        linker_files = ":all",
        objcopy_files = ":empty",
        static_runtime_libs = [":empty"],
        strip_files = ":empty",
        supports_param_files = 0,
    )

Congratulations! You are now using the emscripten toolchain to build your C++ sample code. The next steps are optional but are included for completeness.
-
(Optional) Run the build again. Bazel throws the following error:

    ERROR: .../BUILD:1:1: C++ compilation of rule '//:helloworld.js' failed (Exit 1)

The next step is to make the toolchain deterministic and hermetic, that is, limit it to only touch files it’s supposed to touch and ensure it doesn’t write temporary data outside the sandbox.
You also need to ensure the toolchain does not assume the existence of your home directory with its configuration files and that it does not depend on unspecified environment variables.
For our example project, make the following modifications to the toolchain/BUILD file:

    filegroup(
        name = "all",
        srcs = [
            "emcc.sh",
            "@emscripten_toolchain//:all",
            "@emscripten_clang//:all",
            ":emscripten_cache_content"
        ],
    )

    filegroup(
        name = "emscripten_cache_content",
        srcs = glob(["emscripten_cache/**/*"]),
    )

Because emscripten caches standard library files, check the precompiled bitcode files into the toolchain/emscripten_cache directory. This saves time, because stdlib is not compiled for every action, and it prevents emscripten from storing temporary data in arbitrary locations. You can create the files by calling the following from the emscripten_clang repository (or let emscripten create them in ~/.emscripten_cache):

    embuilder.py build dlmalloc libcxx libc gl libcxxabi libcxx_noexcept wasm-libc

Copy those files to toolchain/emscripten_cache, where the emscripten_cache_content filegroup shown above picks them up.
Also update the emcc.sh script to look as follows:

    #!/bin/bash
    set -euo pipefail

    export LLVM_ROOT='external/emscripten_clang'
    export EMSCRIPTEN_NATIVE_OPTIMIZER='external/emscripten_clang/optimizer'
    export BINARYEN_ROOT='external/emscripten_clang/'
    export NODE_JS=''
    export EMSCRIPTEN_ROOT='external/emscripten_toolchain'
    export SPIDERMONKEY_ENGINE=''
    export EM_EXCLUSIVE_CACHE_ACCESS=1
    export EMCC_SKIP_SANITY_CHECK=1
    export EMCC_WASM_BACKEND=0

    mkdir -p "tmp/emscripten_cache"

    export EM_CACHE="tmp/emscripten_cache"
    export TEMP_DIR="tmp"

    # Prepare the cache content so emscripten doesn't keep rebuilding it
    cp -r toolchain/emscripten_cache/* tmp/emscripten_cache

    # Run emscripten to compile and link
    python external/emscripten_toolchain/emcc.py "$@"

    # Delete line 2 of every generated .d file
    find . -name "*.d" -exec sed -i '2d' {} \;

Bazel can now properly compile the sample C++ code in helloworld.cc.
-
(Optional) Run the build again. Bazel throws the following error:

    ..../BUILD:1:1: undeclared inclusion(s) in rule '//:helloworld.js':
    this rule is missing dependency declarations for the following files included by 'helloworld.cc':
    '.../external/emscripten_toolchain/system/include/libcxx/stdio.h'
    '.../external/emscripten_toolchain/system/include/libcxx/__config'
    '.../external/emscripten_toolchain/system/include/libc/stdio.h'
    '.../external/emscripten_toolchain/system/include/libc/features.h'
    '.../external/emscripten_toolchain/system/include/libc/bits/alltypes.h'

At this point you have successfully compiled the example C++ code. The error above occurs because Bazel uses a .d file produced by the compiler to verify that all includes have been declared and to prune action inputs.

In the .d file, Bazel discovered that our source code references system headers that have not been explicitly declared in the BUILD file. This in and of itself is not a problem, and you can fix it by adding the target folders as -isystem directories to the toolchain definition in the CROSSTOOL file as follows:

    compiler_flag: "-isystem"
    compiler_flag: "external/emscripten_toolchain/system/include/libcxx"
    compiler_flag: "-isystem"
    compiler_flag: "external/emscripten_toolchain/system/include/libc"
-
(Optional) Run the build again. With this final change, the build now completes error-free.
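The hermetic emcc.sh wrapper from the earlier optional step finishes by running sed with the expression '2d' on every .d file. The sketch below shows what that expression does, using a made-up dependency file in a scratch directory. The file contents and paths are placeholders chosen for illustration and are not real emscripten output.

```shell
#!/bin/bash
set -euo pipefail

# Scratch directory so the sketch does not touch a real workspace.
work="$(mktemp -d)"

# Hypothetical dependency file in Makefile syntax; the entries are placeholders.
cat > "$work/helloworld.d" <<'EOF'
helloworld.o: \
  /tmp/hypothetical/extra_entry \
  helloworld.cc
EOF

# '2d' deletes line 2 and keeps every other line. The wrapper uses 'sed -i'
# to edit files in place; plain sed is used here so the sample stays intact.
trimmed="$(sed '2d' "$work/helloworld.d")"
printf '%s\n' "$trimmed"
```

Because the expression is purely positional, it removes whatever happens to be on the second line, so it is worth inspecting a generated .d file before relying on it with a different emscripten version.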