[llvm-dev] Finding label of basic block where a conditional branch merges

Cranmer, Joshua via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 29 13:16:13 PST 2019

The answer to this question depends a lot as to what you mean by "where a conditional branch merges."

The immediate post-dominator of a basic block is the first point where all possible paths from that basic block must execute. When your basic block in question is the immediate dominator (or indeed, just any dominator), then that means that you can say that you are the head of an if statement and the immediate post-dominator is the tail of the same if statement.

However, when gotos, or goto-like statements (break, continue, return, throw, etc.) are involved, then the tail of an if statement is no longer a postdominator. Note that such statements can be introduced by optimizations such as jump threading.

if (...) {
  return true;
} else {
// This node is no longer a postdominator because of the return.

The question here is what you want to do. If your goal is akin to decompilation, where you want to "ignore" these branches for the purposes of making simple graphs, then the approach is going to be a heuristic-filled approach for which there are no obvious answers. If your goal is instead mere correctness--you want to undo something you did before the if statement--then you're going to have to adapt your algorithm to account for the possibility of these kinds of CFGs.

Also note that LLVM does not guarantee the existence of value names--clang, in release builds, does not generate any value names whatsoever, for blocks or values. Any approach relying on clang generating these names is not going to work in such scenarios.

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Sebastian Roland via llvm-dev
Sent: Tuesday, January 29, 2019 15:23
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] Finding label of basic block where a conditional branch merges


is there any standard way to figure out the label of the basic block where a conditional branch merges?

If we have a branch statement without an else part this is straightforward:

br i1 %cmp, label %if.then, label %if.end, !dbg !24

We only need to check the operand names whether they contain "end". 
However things are getting
more complex when there is an else branch and in particular when we have nested branches.

Take for example the following C code:

if (x == NULL || y == NULL) {
     do {
        // ...
     } while (0);

This yields the following IR:

br i1 %cmp, label %if.then, label %lor.lhs.false, !dbg !31

lor.lhs.false:                                    ; preds = %entry
   %1 = load %struct.s1*, %struct.s1** %s1.addr, align 8, !dbg !32
   %cmp1 = icmp eq %struct.s1* %1, null, !dbg !33
   br i1 %cmp1, label %if.then, label %if.end, !dbg !34

if.then:                                          ; preds = %lor.lhs.false, %entry
   br label %do.body, !dbg !35, !llvm.loop !37

do.body:                                          ; preds = %if.then
   br label %do.end, !dbg !39

do.end:                                           ; preds = %do.body
   br label %if.end, !dbg !41

if.end:                                           ; preds = %do.end, %lor.lhs.false
   call void @llvm.dbg.declare(metadata i32* %a, metadata !42, metadata !24), !dbg !43

The question now is how to obtain "if.end" given br i1 %cmp, label %if.then, label %lor.lhs.false, !dbg !31.

My current algorithm basically works for a lot of cases but fails here. 
What I am doing is to push the branch instruction to a stack and then always use the left most label of the next terminator instruction to move forward. Whenever I encounter a basic block that contains an "end" 
in its label I pop from the stack, whenever there is a conditional branch I push. If the stack is empty I have reached the correct basic block. For the example given it would however yield "do.end" which is not correct.

I am wondering whether there is any known graph algorithm I could use to solve the problem or even better something that is already implemented within LLVM?


More information about the llvm-dev mailing list