[llvm-dev] [RFC] llvm-bisectd: a bisection daemon for supporting bisection with parallel builds

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 2 10:40:10 PDT 2021


Nifty! Look forward to seeing how that shakes out/is built upon/etc.

On Tue, Nov 2, 2021 at 10:23 AM Amara Emerson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi all,
>
> I’d just like to draw attention to some patch reviews for a new tool I’m
> proposing, llvm-bisectd: https://reviews.llvm.org/D113030
>
> There’s documentation in the Bisection.md file in the patch, which I’ll
> just paste here for convenience.
>
> # Bisection with llvm-bisectd
>
> ## Introduction
>
> The `llvm-bisectd` tool allows LLVM developers to rapidly bisect
> miscompiles in
> clang or other tools running in parallel. This document explains how the
> tool
> works and how to leverage it for bisecting your own specific issues.
>
> Bisection as a general debugging technique can be done in multiple ways.
> We can
> bisect across the *time* dimension, which usually means that we're
> bisecting
> commits made to LLVM. We could instead bisect across the dimension of the
> LLVM
> codebase itself, disabling some optimizations and leaving others enabled,
> to
> narrow down the configuration that reproduces the issue. We can also
> bisect in
> the dimension of the target program being compiled, e.g. compiling some
> parts
> with a known good configuration to narrow down the problematic location in
> the
> program. The `llvm-bisectd` tool is intended to help with this last
> approach to
> debugging: finding the place where a bug is introduced. It does so with
> the aim
> of being minimally intrusive to the build system of the target program.
>
> ## High level design
>
> The bisection process with `llvm-bisectd` uses a client/server model,
> where all
> the state about the bisection is maintained by the `llvm-bisectd` daemon.
> The
> compilation tools (e.g. clang) send requests and get responses back telling
> them what to do. As a developer, debugging using this methodology is
> intended
> to be simple, with the daemon taking care of most of the complexity.
>
> ### Bisection keys
>
> This process relies on a user-defined key that's used to represent a
> particular
> action being done at a unique place in the target program's build. The key
> is a
> string to allow the most flexibility of data representation. `llvm-bisectd`
> doesn't care what the meaning of the key is, as long as has the following
> properties:
>  1. The key maps onto a specific place in the source program in a stable
> manner.
>     Even if the software is being built with multiple compilers running
>     concurrently, the key should not be affected.
>  2. Between one build of the target software and the next (clean) build,
> the
>     same set of keys should be generated exactly.
>
> For our example of bisecting a novel optimization pass, a good choice of
> key
> would be the module + function name of the target program being compiled.
> The
> function name meets requirement 1. because each module + function string
> refers
> to a unique place in the target program. (A module may not have two
> functions
> with the same symbol name). The inclusion of the module name in the key
> helps
> to disambiguate two local linkage functions with the same name in two
> different
> translation units. The key also satisfies requirement 2. because the
> function
> names are static between one build and the next (e.g. no random
> auto-generation
> going on).
>
> ## Bisection workflow
>
> The bisection process has two stages. The first is called the *learning*
> stage,
> and the second is the main *bisection* stage. The purpose of the learning
> stage is for the bisection daemon to *learn* about all the keys that will
> be
> bisected through during each bisection round.
>
> The first thing that needs to be done is that `llvm-bisectd` needs to be
> started as a daemon.
>
> ```console
> $ llvm-bisectd
> bisectd > _
> ```
>
> On start, `llvm-bisectd` initializes into the learning phase, so nothing
> else
> needs to be done.
>
> Then, the software project being debugged is built with the client tools
> like
> clang having the bisection mode enabled. This can be a compiler flag or
> some
> other mechanism. For example, to bisect GlobalISel across target functions,
> we can pass `-mllvm -gisel-bisect-selection` to clang.
>
> During the first build of the project, the client tools are sending a
> bisection
> request to `llvm-bisectd` for each key. `llvm-bisectd` in the learning
> phase
> just replies to the clients with the answer "YES". In the background, it's
> storing each unique key it receives into a vector for later.
>
> ### Bisection phase
>
> After the first build is done, the learning phase is over, and
> `llvm-bisectd`
> should know about all the keys that will be requested in future builds.
> We can start the bisection phase now by using the `start-bisection`
> command in
> the `llvm-bisectd` command interpreter.
>
> ```
> bisectd > start-bisect
> Starting bisection with 17306 total keys to search
> bisectd > _
> ```
>
> We're now in the bisection phase. Now, we perform the following actions in
> a
> repeatedly until `llvm-bisectd` terminates with an answer.
>  1. Do a clean build of the project (with the bisection flags as before)
>  2. Test the resulting build to see if it still exhibits the bug.
>  3. If the bug remains, then we type the command `bad` into the
> `llvm-bisectd`
>     interpreter. If the bug has disappeared, we type the `good` command
> instead.
>
> And that's it! Eventually the bisection will finish and `llvm-bisectd` will
> print the *key* that, when enabled, triggers the bug.
>
> ``` console
> Bisection completed. Failing key was:
> /work/testing/llvm-test-suite/CTMark/tramp3d-v4/tramp3d-v4.cpp
> _ZN17MultiArgEvaluatorI16MainEvaluatorTagE13createIterateI9MultiArg3I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10BrickViewUESC_SC_EN4Adv51Z13MomentumfluxZILi3EEELi3E15EvaluateLocLoopISH_Li3EEEEvRKT_RKT0_RK8IntervalIXT1_EER14ScalarCodeInfoIXT1_EXsrSK_4sizeEERKT2_
> Exiting...
> ```
>
> ## Adding bisection support in clients
>
> Adding support for bisecting a new type of action is simple. The client
> only
> needs to generate a key at the point where bisection is needed, and then
> use
> client utilities in `lib/Support/RemoteBisectorClient.cpp` to talk to the
> daemon. For example, if the bisection is to done for a `FunctionPass`
> optimization, then one place to add the code would be to the
> `runOnFunction()`
> method, using the function name as a key.
>
> ```C++
> bool runOnFunction(Function &F) {
>   // ...
>   if (EnableBisectForNewOptimization) {
>     std::string Key = F.getParent()->getSourceFileName() + " "
>                         + F.getName().str();
>     RemoteBisectClient BisectClient;
>     if (!BisectClient.shouldPerformAction(Key))
>       return false; // Bisector daemon told us to skip this action.
>   }
>   // Continue with the optimization
>   // ...
> }
> ```
>
> Thanks,
> Amara
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211102/c665519a/attachment.html>


More information about the llvm-dev mailing list