[llvm-bugs] [Bug 41864] New: Register Coalescing: missed optimization opportunity

via llvm-bugs llvm-bugs at lists.llvm.org
Mon May 13 12:35:42 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=41864

            Bug ID: 41864
           Summary: Register Coalescing: missed optimization opportunity
           Product: libraries
           Version: 8.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: mgudim at gmail.com
                CC: llvm-bugs at lists.llvm.org

In the course of our work, we came across an example which (as we believe)
proves that it is profitable to add a heuristics to decide the order in which
RegisterCoalescer looks at copies in a basic block.

We have the following pattern generated from an innermost loop with one basic
block body (unfortunately, we cannot post the exact IR):

# *** IR Dump After Live Variable Analysis ***:

BB#24: derived from LLVM BB %for.body181
  %vreg381<def> = PHI %vreg361, <BB#23>, %vreg378, <BB#24>;
  %vreg382<def> = PHI %vreg361, <BB#23>, %vreg378, <BB#24>;
  ...
  STORE %vreg381, %vreg385, -2; mem:ST2[%269](noalias=!11);
  ...
  %vreg373<def,tied1> = INST %vreg381<tied0>, %vreg387<kill>, %vreg382, 1;
  ...
  %vreg378<def,tied1> = INST %vreg373<kill,tied0>, %vreg370<kill>,
%vreg382<kill>, 0;
  ...
  CONDITIONAL_BRANCH <BB#24>;

# *** IR Dump After Two-Address instruction pass ***:
BB#24: derived from LLVM BB %for.body181
    Predecessors according to CFG: BB#23 BB#24
  %vreg382<def> = COPY %vreg639;
  %vreg381<def> = COPY %vreg639<kill>;
  STORE %vreg381, %vreg385, -2;
  ...
  %vreg373<def> = COPY %vreg381;
  %vreg373<def,tied1> = INST %vreg373<tied0>, %vreg387<kill>, %vreg382, 1;
  ...
  %vreg378<def> = COPY %vreg373<kill>;
  %vreg378<def,tied1> = INST %vreg378<tied0>, %vreg370<kill>, %vreg382<kill>,
0;
  ...
  %vreg639<def> = COPY %vreg378<kill>;
  ...
  CONDITIONAL_BRANCH<BB#24>;

The RegisterCoalescer works through the basic block from the first copy to last
as follows:

Remove %vreg382<def> = COPY %vreg639:
  %vreg381<def> = COPY %vreg639;
  %vreg369<def> = COPY %vreg639;
  ...
  %vreg373<def> = COPY %vreg381;
  %vreg373<def,tied1> = INSTR %vreg373<tied0>, %vreg644, %vreg639, 1;
  ...
  %vreg378<def> = COPY %vreg373;
  %vreg378<def,tied1> = INSTR %vreg378<tied0>, %vreg370, %vreg639, 0;
  ...
  %vreg639<def> = COPY %vreg378;
  ...
  CONDITIONAL_BRANCH<BB#24>;

Remove %vreg381<def> = COPY %vreg639:
BB#24: derived from LLVM BB %for.body181
    Predecessors according to CFG: BB#23 BB#24
  STORE %vreg639, %vreg642, -2;
  ...
  %vreg373<def> = COPY %vreg639;
  %vreg373<def,tied1> = INST %vreg373<tied0>, %vreg644, %vreg373, 1;
  ...
  %vreg378<def> = COPY %vreg373;
  %vreg378<def,tied1> = INST %vreg378<tied0>, %vreg370, %vreg639, 0;
  ...
  %vreg639<def> = COPY %vreg378;
  ...
  COND_BRANCH <BB#24>;

Remove %vreg373<def> = COPY %vreg639: Interference!
Remove %vreg378<def> = COPY %vreg373:
BB#24: derived from LLVM BB %for.body181
  STORE %vreg639, %vreg642, -2; mem:ST2[%269](noalias=!11);
  ...
  %vreg378<def> = COPY %vreg639;
  %vreg378<def,tied1> = INSTR %vreg378<tied0>, %vreg644, %vreg378, 1;
  ...
  %vreg378<def,tied1> = INSTR %vreg378<tied0>, %vreg370, %vreg639, 0;
  ...
  %vreg639<def> = COPY %vreg378;
  ...
  CONDITIONAL_BRANCH<BB#25>

Remove %vreg639<def> = COPY %vreg378: Interference!


So we end up with the following assembly code:

  copy from v0 to v4
  ...
  copy from v4 to v0


The last copy can be removed, as we have verified.


We believe this problem can be fixed in RegisterCoalescer. We hacked the
RegisterCoalescer so that it works through the copies in the problematic basic
block in the reversed order of the original (from last copy to the first). With
the reversed order the redundant copy disappears and correct assembly is
generated. This lets us conclude that there should be some heuristic for
ordering input copies or perhaps delaying the decision of coalescing.

We would like to hear community opinion on how to approach this problem. In
particular, should this kind of optimization be inside RegisterCoalescer? If
yes, has any work been done in this direction?

Ideally, we want to find a proper solution and contribute it to the llvm
project.

Thank you and looking forward to your suggestions,

Mikhail Gudim

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20190513/816d5e14/attachment.html>


More information about the llvm-bugs mailing list