[llvm-bugs] [Bug 37412] New: [RegAllocGreedy] Folding memory operand alone gives many more uncoalesced copies

via llvm-bugs llvm-bugs at lists.llvm.org
Fri May 11 00:05:58 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=37412

            Bug ID: 37412
           Summary: [RegAllocGreedy]  Folding memory operand alone gives
                    many more uncoalesced copies
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: paulsson at linux.vnet.ibm.com
                CC: llvm-bugs at lists.llvm.org

Recently it was discovered that SystemZInstrInfo::foldMemoryOperandImpl() did
not handle the AHIMux opcode the way it handles AHI. It should, since AHIMux
(addition of immediate) can also be replaced with ASI (add immediate to
memory) when the register operand is spilled. AHIMux is a sibling opcode of
AHI that can use either high or low 32-bit registers.
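
As a rough illustration of the folding rule described above, here is a toy
model in Python. It is not LLVM code: the function and set names are
hypothetical, and the range check assumes that ASI takes a signed 8-bit
immediate.

```python
# Toy model only: hypothetical names, not the SystemZ backend implementation.
FOLDABLE_ADD_IMMEDIATE = {"AHI", "AHIMux"}  # AHIMux is the newly handled opcode


def fits_signed_8bit(value):
    # Assumption: ASI encodes its immediate as a signed 8-bit field.
    return -128 <= value <= 127


def lower_add_immediate(opcode, imm, reg_is_spilled):
    """Return the instruction sequence used for `reg += imm`."""
    if not reg_is_spilled:
        # The value lives in a register, so the add is emitted unchanged.
        return [opcode]
    if opcode in FOLDABLE_ADD_IMMEDIATE and fits_signed_8bit(imm):
        # Fold the spill slot into the add: one add-immediate-to-memory.
        return ["ASI"]
    # Otherwise fall back to reload-add-save around the original opcode.
    return ["reload", opcode, "save"]


print(lower_add_immediate("AHIMux", 127, reg_is_spilled=True))
```

Without "AHIMux" in the set, the spilled case falls through to the three-step
reload-add-save sequence.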

This was simple to fix by adding a check for AHIMux in
foldMemoryOperandImpl(). However, when doing so I found that even though my
test case now had a few ASIs instead of reload-add-save (which was nice), there
were now +20 more COPYs that were not coalesced! That is considerable, given
how small the test case is.

This happened with the very test case that was made for the ASI, which was
intended to use enough registers to cause the spilling, so that ASI would be
used. Basically it has 32 volatile loads with a conditional branch past a block
that adds an immediate to the loaded values, and in the final block the
resulting value is stored:

entry:
  %val0 = load volatile i32 , i32 *%ptr
  ...
  %val31 = load volatile i32 , i32 *%ptr
  %test = icmp ne i32 %sel, 0
  br i1 %test, label %add, label %store

add:
  %add0 = add i32 %val0, 127
  ...
  %add31 = add i32 %val31, 127
  br label %store

store:
  %new0 = phi i32 [ %val0, %entry ], [ %add0, %add ]
  ...
  %new31 = phi i32 [ %val31, %entry ], [ %add31, %add ]
  store volatile i32 %new0, i32 *%ptr
  ...
  store volatile i32 %new31, i32 *%ptr
  ret void
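
Since the listing above elides the repeated lines, a small script along these
lines could regenerate the whole pattern. This is only a sketch: the function
name @f and the parameter list (i32 *%ptr, i32 %sel) are assumptions inferred
from how %ptr and %sel are used, not copied from the actual test file.

```python
def make_test_ir(n=32, imm=127, func_name="f"):
    """Build the load / conditional add / phi / store pattern as LLVM IR text."""
    # Assumed signature: the original function header is not shown in the report.
    lines = [f"define void @{func_name}(i32 *%ptr, i32 %sel) {{", "entry:"]
    for i in range(n):
        lines.append(f"  %val{i} = load volatile i32 , i32 *%ptr")
    lines.append("  %test = icmp ne i32 %sel, 0")
    lines.append("  br i1 %test, label %add, label %store")
    lines.append("")
    lines.append("add:")
    for i in range(n):
        lines.append(f"  %add{i} = add i32 %val{i}, {imm}")
    lines.append("  br label %store")
    lines.append("")
    lines.append("store:")
    for i in range(n):
        lines.append(f"  %new{i} = phi i32 [ %val{i}, %entry ], [ %add{i}, %add ]")
    for i in range(n):
        lines.append(f"  store volatile i32 %new{i}, i32 *%ptr")
    lines.append("  ret void")
    lines.append("}")
    return "\n".join(lines) + "\n"


ir = make_test_ir()
print(ir)
```

Lowering n makes it easy to see at which point the register pressure starts to
cause spilling.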

After "Simple Register Coalescing", the MachineFunction (note: no COPYs) looks
pretty much like this:

bb.0.entry:
%98:grx32bit = LMux %96:addr64bit, 0, $noreg :: (volatile load 4 from %ir.ptr)
...
%129:grx32bit = LMux %96:addr64bit, 0, $noreg :: (volatile load 4 from %ir.ptr)
CHIMux %97:gr32bit, 0, implicit-def $cc
BRC 14, 6, %bb.1, implicit killed $cc
J %bb.2

bb.1:
%98:grx32bit = AHIMux %98:grx32bit, 127, implicit-def dead $cc
...
%129:grx32bit = AHIMux %129:grx32bit, 127, implicit-def dead $cc

bb.2:
STMux %98:grx32bit, %96:addr64bit, 0, $noreg :: (volatile store 4 into %ir.ptr)
...
STMux %129:grx32bit, %96:addr64bit, 0, $noreg :: (volatile store 4 into %ir.ptr)
Return

Greedy now runs out of registers, splits live ranges, and inserts COPYs like
this (without the ASI patch):

bb.0.entry:
%132:grx32bit = LMux %96:addr64bit, 0, $noreg :: (volatile load 4 from %ir.ptr)
...
%221:grx32bit = LMux %96:addr64bit, 0, $noreg :: (volatile load 4 from %ir.ptr)
CHIMux %97:gr32bit, 0, implicit-def $cc
BRC 14, 6, %bb.1, implicit killed $cc
J %bb.1

bb.3:
%137:grx32bit = COPY %136:grx32bit
... (23 COPYs)
%203:grx32bit = COPY %202:grx32bit
J %bb.2

bb.1:
%137:grx32bit = COPY %136:grx32bit
...
%203:grx32bit = COPY %202:grx32bit
%137:grx32bit = AHIMux %137:grx32bit, 127, implicit-def dead $cc
...
%222:grx32bit = AHIMux %222:grx32bit, 127, implicit-def dead $cc

Note that the COPYs in bb.1 here lie before the AHIMuxes. For some reason,
when ASIs are used in bb.1, the COPYs are instead inserted *after* the
AHIMuxes:

bb.1.add:
ASI %stack.0, 0, 127, implicit-def dead $cc :: (store 4 into %stack.0)
%135:grx32bit = AHIMux %135:grx32bit, 127, implicit-def dead $cc
... (also 3 more ASIs)
%216:grx32bit = AHIMux %216:grx32bit, 127, implicit-def dead $cc
%220:grx32bit = COPY %219:grx32bit
...
%136:grx32bit = COPY %135:grx32bit

Unfortunately, after "Virtual Register Rewriter" all those COPYs are still
there, while without the ASIs they are gone!

I find this odd and would like some advice on what to do about it. Is there a
reason that the COPYs are now inserted in a different position, or is this
just a bug?

Attached is a side-by-side diff with '-print-after-all -debug-only=regalloc'
(master to the left, ASIs to the right).
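
To compare two such dumps, a helper like the following could count the COPYs
that remain after each pass. This is a sketch under assumptions: the banner
pattern "# *** IR Dump After <pass> ***" may not match every LLVM version, the
function name is hypothetical, and the sample text is made up for illustration
rather than taken from the attached logs.

```python
import re

# Assumed banner shape; adjust the pattern if the dump looks different.
BANNER = re.compile(r"^# \*\*\* IR Dump After (.+?) \*\*\*")
COPY = re.compile(r"=\s*COPY\b")


def copies_per_pass(dump_text):
    """Map each pass name to the number of COPY instructions printed after it."""
    counts = {}
    current = None
    for line in dump_text.splitlines():
        match = BANNER.match(line)
        if match:
            # A new dump section starts; begin counting for this pass.
            current = match.group(1)
            counts[current] = 0
        elif current is not None and COPY.search(line):
            counts[current] += 1
    return counts


# Made-up sample, shaped like the listings in this report.
sample = """\
# *** IR Dump After Greedy Register Allocator ***:
%220:grx32bit = COPY %219:grx32bit
%136:grx32bit = COPY %135:grx32bit
# *** IR Dump After Virtual Register Rewriter ***:
$r0l = AHIMux $r0l, 127, implicit-def dead $cc
"""
print(copies_per_pass(sample))
```

Running it on both logs and comparing the entry for "Virtual Register
Rewriter" would show how many COPYs survive with and without the ASI change.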
