[llvm-dev] Loop branching inefficiencies in Backend output

Sun Jul 9 16:36:00 PDT 2017

Hi,

I am working on a custom backend, and I am trying to figure out how to deal
with some branching inefficiencies in my output code, and the best way to
fix it.

So, let's say I am compiling a small function that takes the sum of an
array.
int loop(int* array, int n) {
  int ret = 0;
  for (int i = 0; i < n; i++) {
    ret += array[i];
  }
  return ret;
}

The problem I am having is that in the generated code, we got something
along these lines.

loop:
    <loop-code>
    branch cond exit-block
    branch loop
exit-block:
    <exit-block code>

Now, due to the basic block placement, that could be simplified to the
following, removing an extra branch instruction.
loop:
    <loop-code>
    branch !cond loop
exit-block:
    <exit-block code>

So, I decided to investigate to see if this problem occurs in other
backends (basically to see if I am missing some implementation in my
backend). I found that you can see this issue with the same code for the
NVPTX backend, but not for the x86 or ARM backends.

When I looked at the debug output for the x86 backend though, I can't
figure out how they realize to get rid of that branch. Up to the last point
that MachineInstrs are printed (the rewrite virtual registers pass), the
MachineInstrs still show the two jump paradigm. But in the final output, it
somehow becomes one branch case.

I know one solution is for me to add a pass after basic block placement to
change these instructions, but I am wondering if there is a recommended way
to do this. Especially as this seems like a common problem, where an
existing generic pass would have some interface a backend should support to
allow branch re-writes to create the single branch case.

Thanks,
-Dilan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/dfe6d347/attachment.html>