<div dir="ltr">Nifty! Look forward to seeing how that shakes out/is built upon/etc.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Nov 2, 2021 at 10:23 AM Amara Emerson via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>
<br>
I’d just like to draw attention to some patch reviews for a new tool I’m proposing, llvm-bisectd: <a href="https://reviews.llvm.org/D113030" rel="noreferrer" target="_blank">https://reviews.llvm.org/D113030</a><br>
<br>
There’s documentation in the Bisection.md file in the patch, which I’ll just paste here for convenience.<br>
<br>
# Bisection with llvm-bisectd<br>
<br>
## Introduction<br>
<br>
The `llvm-bisectd` tool allows LLVM developers to rapidly bisect miscompiles in<br>
clang or other tools, even when many instances run in parallel. This document explains how the tool<br>
works and how to leverage it for bisecting your own specific issues.<br>
<br>
Bisection as a general debugging technique can be done in multiple ways. We can<br>
bisect across the *time* dimension, which usually means that we're bisecting<br>
commits made to LLVM. We could instead bisect across the dimension of the LLVM<br>
codebase itself, disabling some optimizations and leaving others enabled, to<br>
narrow down the configuration that reproduces the issue. We can also bisect in<br>
the dimension of the target program being compiled, e.g. compiling some parts<br>
with a known good configuration to narrow down the problematic location in the<br>
program. The `llvm-bisectd` tool is intended to help with this last approach to<br>
debugging: finding the place where a bug is introduced. It does so with the aim<br>
of being minimally intrusive to the build system of the target program.<br>
<br>
## High level design<br>
<br>
The bisection process with `llvm-bisectd` uses a client/server model, where all<br>
the state about the bisection is maintained by the `llvm-bisectd` daemon. The<br>
compilation tools (e.g. clang) send requests and get responses back telling<br>
them what to do. As a developer, debugging using this methodology is intended<br>
to be simple, with the daemon taking care of most of the complexity.<br>
<br>
### Bisection keys<br>
<br>
This process relies on a user-defined key that's used to represent a particular<br>
action being done at a unique place in the target program's build. The key is a<br>
string to allow the most flexibility of data representation. `llvm-bisectd`<br>
doesn't care what the meaning of the key is, as long as it has the following<br>
properties:<br>
1. The key maps onto a specific place in the source program in a stable manner.<br>
Even if the software is being built with multiple compilers running<br>
concurrently, the key should not be affected.<br>
2. Between one build of the target software and the next (clean) build, the <br>
same set of keys should be generated exactly.<br>
<br>
For our example of bisecting a novel optimization pass, a good choice of key<br>
would be the module + function name of the target program being compiled. The<br>
key meets requirement 1 because each module + function string refers<br>
to a unique place in the target program (a module may not have two functions<br>
with the same symbol name). The inclusion of the module name in the key helps<br>
to disambiguate two local linkage functions with the same name in two different<br>
translation units. The key also satisfies requirement 2 because the function<br>
names are static between one build and the next (e.g. no random auto-generation<br>
going on).<br>
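<br>
As a minimal sketch of the key scheme described above, the helper below joins a<br>
module name and a function name with a space and checks that a set of keys has<br>
no duplicates. The names `makeBisectKey` and `keysAreUnique` are hypothetical<br>
and only illustrate the idea; they are not part of the patch.<br>
<br>
```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): build a key from the module's
// source file name and the function's symbol name, separated by a space.
std::string makeBisectKey(const std::string &ModuleName,
                          const std::string &FunctionName) {
  return ModuleName + " " + FunctionName;
}

// Hypothetical check for requirement 1: every key must name a distinct place
// in the target program, so no key may appear twice in the list.
bool keysAreUnique(const std::vector<std::string> &Keys) {
  std::set<std::string> Seen(Keys.begin(), Keys.end());
  return Seen.size() == Keys.size();
}
```
<br>
With this scheme, two local linkage functions named `helper` in `a.cpp` and<br>
`b.cpp` produce the distinct keys `a.cpp helper` and `b.cpp helper`, and a<br>
rebuild produces the same strings again.<br>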
<br>
## Bisection workflow<br>
<br>
The bisection process has two phases. The first is called the *learning* phase,<br>
and the second is the main *bisection* phase. The purpose of the learning<br>
phase is for the bisection daemon to *learn* about all the keys that will be<br>
bisected through during each bisection round.<br>
<br>
First, start `llvm-bisectd` as a daemon.<br>
<br>
```console<br>
$ llvm-bisectd<br>
bisectd > _<br>
```<br>
<br>
On start, `llvm-bisectd` initializes into the learning phase, so nothing else<br>
needs to be done.<br>
<br>
Then, the software project being debugged is built with bisection mode enabled<br>
in the client tools (e.g. clang). This can be a compiler flag or some<br>
other mechanism. For example, to bisect GlobalISel across target functions,<br>
we can pass `-mllvm -gisel-bisect-selection` to clang.<br>
<br>
During the first build of the project, the client tools send a bisection<br>
request to `llvm-bisectd` for each key. In the learning phase, `llvm-bisectd`<br>
just replies to the clients with the answer "YES". In the background, it<br>
stores each unique key it receives into a vector for later.<br>
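<br>
The learning behaviour can be pictured with the following sketch. It is only a<br>
model of what the paragraph above describes, not the daemon's actual code: the<br>
class name `LearningPhase` and its methods are hypothetical, and the request<br>
transport between client and daemon is left out entirely.<br>
<br>
```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model of the learning phase (not the real llvm-bisectd code).
class LearningPhase {
  std::vector<std::string> Keys;        // unique keys in first-seen order
  std::unordered_set<std::string> Seen; // used to detect repeated keys

public:
  // Handle one client request: remember the key if it is new, and always
  // answer "YES" (true) so the first build runs every action.
  bool handleRequest(const std::string &Key) {
    if (Seen.insert(Key).second)
      Keys.push_back(Key);
    return true;
  }

  // The keys learned so far, ready to be searched in the bisection phase.
  const std::vector<std::string> &learnedKeys() const { return Keys; }
};
```
<br>
Keeping the keys in first-seen order is a choice made for this sketch; any<br>
ordering works as long as it stays fixed for the rest of the bisection.<br>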
<br>
### Bisection phase<br>
<br>
After the first build is done, the learning phase is over, and `llvm-bisectd`<br>
should know about all the keys that will be requested in future builds.<br>
We can start the bisection phase now by using the `start-bisect` command in<br>
the `llvm-bisectd` command interpreter.<br>
<br>
```<br>
bisectd > start-bisect<br>
Starting bisection with 17306 total keys to search<br>
bisectd > _<br>
```<br>
<br>
We're now in the bisection phase. Now, we perform the following actions<br>
repeatedly until `llvm-bisectd` terminates with an answer.<br>
1. Do a clean build of the project (with the bisection flags as before).<br>
2. Test the resulting build to see if it still exhibits the bug.<br>
3. If the bug remains, then we type the command `bad` into the `llvm-bisectd`<br>
interpreter. If the bug has disappeared, we type the `good` command instead.<br>
<br>
And that's it! Eventually the bisection will finish and `llvm-bisectd` will<br>
print the *key* that, when enabled, triggers the bug.<br>
<br>
``` console<br>
Bisection completed. Failing key was: /work/testing/llvm-test-suite/CTMark/tramp3d-v4/tramp3d-v4.cpp _ZN17MultiArgEvaluatorI16MainEvaluatorTagE13createIterateI9MultiArg3I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10BrickViewUESC_SC_EN4Adv51Z13MomentumfluxZILi3EEELi3E15EvaluateLocLoopISH_Li3EEEEvRKT_RKT0_RK8IntervalIXT1_EER14ScalarCodeInfoIXT1_EXsrSK_4sizeEERKT2_<br>
Exiting...<br>
```<br>
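<br>
The document does not spell out how the daemon narrows down the failing key.<br>
One way it could work is a plain binary search over the learned key vector,<br>
assuming exactly one key triggers the bug. The sketch below follows that<br>
assumption; the class `BisectionPhase` and all of its method names are<br>
hypothetical.<br>
<br>
```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical binary search over learned keys (assumes a single failing key
// and a non-empty key list). Not the real llvm-bisectd implementation.
class BisectionPhase {
  std::vector<std::string> Keys;
  std::unordered_map<std::string, std::size_t> Index; // key -> position
  std::size_t Lo = 0; // the failing key lies in [Lo, Hi)
  std::size_t Hi = 0;

  std::size_t mid() const { return Lo + (Hi - Lo) / 2; }

public:
  explicit BisectionPhase(std::vector<std::string> LearnedKeys)
      : Keys(std::move(LearnedKeys)), Hi(Keys.size()) {
    for (std::size_t I = 0; I < Keys.size(); ++I)
      Index.emplace(Keys[I], I);
  }

  // True once the candidate range has shrunk to a single key.
  bool done() const { return Hi - Lo <= 1; }

  // Answer for one client request in the current round: keys below the
  // midpoint are enabled, the rest are skipped. Unknown keys are enabled,
  // which is an arbitrary policy chosen for this sketch.
  bool enabledThisRound(const std::string &Key) const {
    auto It = Index.find(Key);
    return It == Index.end() || It->second < mid();
  }

  // `bad`: the bug is still there, so the failing key was enabled.
  void bad() { if (!done()) Hi = mid(); }

  // `good`: the bug disappeared, so the failing key was skipped.
  void good() { if (!done()) Lo = mid(); }

  const std::string &failingKey() const { return Keys[Lo]; }
};
```
<br>
Under these assumptions the number of rebuilds grows with the logarithm of the<br>
key count, so the 17306 keys in the transcript above would need about 15 rounds.<br>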
<br>
## Adding bisection support in clients<br>
<br>
Adding support for bisecting a new type of action is simple. The client only<br>
needs to generate a key at the point where bisection is needed, and then use<br>
client utilities in `lib/Support/RemoteBisectorClient.cpp` to talk to the<br>
daemon. For example, if the bisection is to be done for a `FunctionPass`<br>
optimization, then one place to add the code would be to the `runOnFunction()`<br>
method, using the function name as a key.<br>
<br>
```C++<br>
bool runOnFunction(Function &F) {<br>
  // ...<br>
  if (EnableBisectForNewOptimization) {<br>
    std::string Key = F.getParent()->getSourceFileName() + " "<br>
                      + F.getName().str();<br>
    RemoteBisectClient BisectClient;<br>
    if (!BisectClient.shouldPerformAction(Key))<br>
      return false; // Bisector daemon told us to skip this action.<br>
  }<br>
  // Continue with the optimization<br>
  // ...<br>
}<br>
```<br>
<br>
Thanks,<br>
Amara<br>
</blockquote></div>