<div dir="ltr">Nifty! Look forward to seeing how that shakes out/is built upon/etc.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Nov 2, 2021 at 10:23 AM Amara Emerson via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>

<br>

I’d just like to draw attention to some patch reviews for a new tool I’m proposing, llvm-bisectd: <a href="https://reviews.llvm.org/D113030" rel="noreferrer" target="_blank">https://reviews.llvm.org/D113030</a><br>

<br>

There’s documentation in the Bisection.md file in the patch, which I’ll just paste here for convenience.<br>

<br>

# Bisection with llvm-bisectd<br>

<br>

## Introduction<br>

<br>

The `llvm-bisectd` tool allows LLVM developers to rapidly bisect miscompiles in<br>

clang or other tools running in parallel. This document explains how the tool<br>

works and how to leverage it for bisecting your own specific issues.<br>

<br>

Bisection as a general debugging technique can be done in multiple ways. We can<br>

bisect across the *time* dimension, which usually means that we're bisecting<br>

commits made to LLVM. We could instead bisect across the dimension of the LLVM<br>

codebase itself, disabling some optimizations and leaving others enabled, to<br>

narrow down the configuration that reproduces the issue. We can also bisect in<br>

the dimension of the target program being compiled, e.g. compiling some parts<br>

with a known good configuration to narrow down the problematic location in the<br>

program. The `llvm-bisectd` tool is intended to help with this last approach to<br>

debugging: finding the place where a bug is introduced. It does so with the aim<br>

of being minimally intrusive to the build system of the target program.<br>

<br>

## High level design<br>

<br>

The bisection process with `llvm-bisectd` uses a client/server model, where all<br>

the state about the bisection is maintained by the `llvm-bisectd` daemon. The<br>

compilation tools (e.g. clang) send requests and get responses back telling<br>

them what to do. As a developer, debugging using this methodology is intended<br>

to be simple, with the daemon taking care of most of the complexity.<br>

<br>

### Bisection keys<br>

<br>

This process relies on a user-defined key that's used to represent a particular<br>

action being done at a unique place in the target program's build. The key is a<br>

string to allow the most flexibility of data representation. `llvm-bisectd`<br>

doesn't care what the meaning of the key is, as long as it has the following<br>

properties:<br>

 1. The key maps onto a specific place in the source program in a stable manner.<br>

    Even if the software is being built with multiple compilers running<br>

    concurrently, the key should not be affected.<br>

 2. Between one build of the target software and the next (clean) build, the <br>

    same set of keys should be generated exactly.<br>

<br>

For our example of bisecting a novel optimization pass, a good choice of key<br>

would be the module + function name of the target program being compiled. This<br>

key meets requirement 1 because each module + function string refers to a<br>

unique place in the target program (a module may not have two functions with<br>

the same symbol name). Including the module name in the key helps to<br>

disambiguate two local-linkage functions with the same name in two different<br>

translation units. The key also satisfies requirement 2 because function<br>

names are static between one build and the next (e.g. no random auto-generation<br>

going on).<br>
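
<br>

As a rough illustration of such a key, here is a minimal sketch. The helper<br>
`makeBisectKey` is hypothetical and not part of the patch; it only assumes the<br>
key is the module's source file name and the function's symbol name joined by<br>
a space.<br>

<br>

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not part of the patch):
// joins the module's source file name and the function's symbol name with a
// space to form a bisection key.
std::string makeBisectKey(const std::string &SourceFileName,
                          const std::string &FunctionName) {
  return SourceFileName + " " + FunctionName;
}
```

<br>

Two local-linkage functions that share a name, say `helper` in `a.cpp` and in<br>
`b.cpp`, then get the distinct keys `a.cpp helper` and `b.cpp helper`.<br>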

<br>

## Bisection workflow<br>

<br>

The bisection process has two stages. The first is called the *learning* stage,<br>

and the second is the main *bisection* stage. The purpose of the learning<br>

stage is for the bisection daemon to *learn* about all the keys that will be<br>

bisected through during each bisection round.<br>

<br>

First, start `llvm-bisectd` as a daemon:<br>

<br>

```console<br>

$ llvm-bisectd<br>

bisectd > _<br>

```<br>

<br>

On start, `llvm-bisectd` initializes into the learning phase, so nothing else<br>

needs to be done.<br>

<br>

Then, the software project being debugged is built with the client tools like<br>

clang having the bisection mode enabled. This can be a compiler flag or some<br>

other mechanism. For example, to bisect GlobalISel across target functions,<br>

we can pass `-mllvm -gisel-bisect-selection` to clang.<br>

<br>

During the first build of the project, the client tools send a bisection<br>

request to `llvm-bisectd` for each key. `llvm-bisectd` in the learning phase<br>

just replies to the clients with the answer "YES". In the background, it<br>

stores each unique key it receives in a vector for later.<br>
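
<br>

The learning-phase behaviour described above could be modelled roughly as<br>
follows. This is only a sketch: the class `LearningPhase` and its members are<br>
hypothetical names, and the real daemon's data structures may differ.<br>

<br>

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model of the learning phase (names are illustrative only).
class LearningPhase {
public:
  // Records the key if it has not been seen before, and always answers
  // "YES" (true) so that the first build runs with every action enabled.
  bool handleRequest(const std::string &Key) {
    if (Seen.insert(Key).second)
      Keys.push_back(Key);
    return true;
  }

  // The unique keys learned so far, in order of first arrival.
  const std::vector<std::string> &keys() const { return Keys; }

private:
  std::vector<std::string> Keys;        // unique keys for the bisection phase
  std::unordered_set<std::string> Seen; // used only to detect duplicates
};
```

<br>

Repeated requests for the same key leave the vector unchanged, so the number<br>
of stored keys matches the number of distinct bisection points in the build.<br>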

<br>

### Bisection phase<br>

<br>

After the first build is done, the learning phase is over, and `llvm-bisectd`<br>

should know about all the keys that will be requested in future builds.<br>

We can start the bisection phase now by using the `start-bisect` command in<br>

the `llvm-bisectd` command interpreter.<br>

<br>

```<br>

bisectd > start-bisect<br>

Starting bisection with 17306 total keys to search<br>

bisectd > _<br>

```<br>

<br>

We're now in the bisection phase. We now perform the following actions<br>

repeatedly until `llvm-bisectd` terminates with an answer:<br>

 1. Do a clean build of the project (with the bisection flags as before).<br>

 2. Test the resulting build to see if it still exhibits the bug.<br>

 3. If the bug remains, then we type the command `bad` into the `llvm-bisectd`<br>

    interpreter. If the bug has disappeared, we type the `good` command instead.<br>

<br>

And that's it! Eventually the bisection will finish and `llvm-bisectd` will<br>

print the *key* that, when enabled, triggers the bug.<br>

<br>

```console<br>

Bisection completed. Failing key was: /work/testing/llvm-test-suite/CTMark/tramp3d-v4/tramp3d-v4.cpp _ZN17MultiArgEvaluatorI16MainEvaluatorTagE13createIterateI9MultiArg3I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10BrickViewUESC_SC_EN4Adv51Z13MomentumfluxZILi3EEELi3E15EvaluateLocLoopISH_Li3EEEEvRKT_RKT0_RK8IntervalIXT1_EER14ScalarCodeInfoIXT1_EXsrSK_4sizeEERKT2_<br>

Exiting...<br>

```<br>
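
<br>

To make the good/bad loop concrete, here is a minimal sketch of one way a<br>
daemon could narrow down the failing key with a plain binary search. It is not<br>
taken from the patch: the class `KeyBisector` and its methods are hypothetical,<br>
and it assumes a single culprit key, a bug that reproduces whenever that key is<br>
enabled, and that unknown keys are answered with "NO".<br>

<br>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the bisection phase (not the tool's actual code).
// Invariant: the culprit key lies in the candidate range [Lo, Hi).
class KeyBisector {
public:
  explicit KeyBisector(std::vector<std::string> LearnedKeys)
      : Keys(std::move(LearnedKeys)), Lo(0), Hi(0) {
    std::sort(Keys.begin(), Keys.end()); // fixed order across builds
    Hi = Keys.size();
  }

  // True once the candidate range has narrowed to at most one key.
  bool done() const { return Hi - Lo <= 1; }

  // Reply for a client request in the current round: only keys in the
  // lower half [Lo, mid()) of the candidate range are enabled.
  bool shouldPerformAction(const std::string &Key) const {
    auto It = std::lower_bound(Keys.begin(), Keys.end(), Key);
    if (It == Keys.end() || *It != Key)
      return false; // assumption: unknown keys are skipped
    std::size_t Idx = static_cast<std::size_t>(It - Keys.begin());
    return Idx >= Lo && Idx < mid();
  }

  // "bad": the bug reproduced, so the culprit was among the enabled keys.
  void bad() {
    if (!done())
      Hi = mid();
  }

  // "good": the bug disappeared, so the culprit was among the disabled keys.
  void good() {
    if (!done())
      Lo = mid();
  }

  // Only meaningful once done() is true and at least one key was learned.
  const std::string &failingKey() const { return Keys[Lo]; }

private:
  std::size_t mid() const { return Lo + (Hi - Lo) / 2; }

  std::vector<std::string> Keys;
  std::size_t Lo, Hi;
};
```

<br>

Under these assumptions each round halves the candidate range, so a search<br>
over N keys needs about log2(N) clean builds.<br>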

<br>

## Adding bisection support in clients<br>

<br>

Adding support for bisecting a new type of action is simple. The client only<br>

needs to generate a key at the point where bisection is needed, and then use<br>

client utilities in `lib/Support/RemoteBisectorClient.cpp` to talk to the<br>

daemon. For example, if the bisection is to be done for a `FunctionPass`<br>

optimization, then one place to add the code would be to the `runOnFunction()`<br>

method, using the function name as a key.<br>

<br>

```C++<br>

bool runOnFunction(Function &F) {<br>

  // ...<br>

  if (EnableBisectForNewOptimization) {<br>

    std::string Key = F.getParent()->getSourceFileName() + " "<br>

                        + F.getName().str();<br>

    RemoteBisectClient BisectClient;<br>

    if (!BisectClient.shouldPerformAction(Key))<br>

      return false; // Bisector daemon told us to skip this action.<br>

  }<br>

  // Continue with the optimization<br>

  // ...<br>

}<br>

```<br>

<br>

Thanks,<br>

Amara<br>


</blockquote></div>