[llvm-bugs] [Bug 34610] New: Regalloc failure due to attempted reload of full register pair where only subreg is needed

via llvm-bugs llvm-bugs at lists.llvm.org
Thu Sep 14 13:16:19 PDT 2017


https://bugs.llvm.org/show_bug.cgi?id=34610

            Bug ID: 34610
           Summary: Regalloc failure due to attempted reload of full
                    register pair where only subreg is needed
           Product: libraries
           Version: trunk
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Register Allocator
          Assignee: unassignedbugs at nondot.org
          Reporter: uweigand at de.ibm.com
                CC: llvm-bugs at lists.llvm.org, matze at braunis.de,
                    paulsson at linux.vnet.ibm.com, qcolombet at apple.com

Building the following test case with llc -mtriple=s390x-linux-gnu results in:
LLVM ERROR: ran out of registers during register allocation

define void @test(i64 %dividend, i64 %divisor) {
  %rem = urem i64 %dividend, %divisor
  call void asm sideeffect "",
    "{r0},{r1},{r2},{r3},{r4},{r5},{r6},{r7},{r8},{r9},{r10},{r11},{r12},{r13},{r14}"(
    i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0,
    i64 0, i64 0, i64 0, i64 0, i64 %rem)
  ret void
}

The register allocator sees the following code:

%vreg3<def,tied1> = DLGR %vreg3<tied0>, %vreg1; GR128Bit:%vreg3 GR64Bit:%vreg1
%R0D<def> = LGHI 0
%R1D<def> = LGHI 0
%R2D<def> = LGHI 0
%R3D<def> = LGHI 0
%R4D<def> = LGHI 0
%R5D<def> = LGHI 0
%R6D<def> = LGHI 0
%R7D<def> = LGHI 0
%R8D<def> = LGHI 0
%R9D<def> = LGHI 0
%R10D<def> = LGHI 0
%R11D<def> = LGHI 0
%R12D<def> = LGHI 0
%R13D<def> = LGHI 0
%R14D<def> = COPY %vreg3:subreg_h64; GR128Bit:%vreg3
INLINEASM <es:> [sideeffect] [attdialect], $0:[reguse], %R0D<kill>,
$1:[reguse], %R1D<kill>, $2:[reguse], %R2D, $3:[reguse], %R3D, $4:[reguse],
%R4D<kill>, $5:[reguse], %R5D<kill>, $6:[reguse], %R6D<kill>, $7:[reguse],
%R7D<kill>, $8:[reguse], %R8D<kill>, $9:[reguse], %R9D<kill>, $10:[reguse],
%R10D<kill>, $11:[reguse], %R11D<kill>, $12:[reguse], %R12D<kill>,
$13:[reguse], %R13D<kill>, $14:[reguse], %R14D<kill>

Note that the DLGR instruction uses a 128-bit register pair (a register with an
even number concatenated with the immediately succeeding register) as input,
holding the 128-bit dividend, and also as output, where the even register of
the pair (subreg_h64) holds the 64-bit remainder and the odd register holds the
64-bit quotient.  These pairs are modeled via the GR128Bit register class in
SystemZ.  (The test case does not use the quotient at all, but it does use the
remainder as input to the inline asm.)

What happens is that the (greedy) register allocator sees that GR128Bit:%vreg3
is used in both the DLGR and the COPY, and notices that no register pair can be
allocated for that whole range, as registers 0..13 are clobbered in between and
14/15 cannot be used as a pair since r15 is reserved (the stack pointer).
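
The pair shortage described above can be reproduced with a toy model.  This is
only a sketch of the constraint, not LLVM code: the function name free_pairs
and the plain integer register numbering are made up for illustration, and the
only facts taken from the report are that r0..r13 are occupied by the inline
asm operands and that r15 is reserved.

```python
# Toy model of the even/odd GPR pairs described in the report (not LLVM code).
RESERVED = {15}                # r15 is the stack pointer
CLOBBERED = set(range(0, 14))  # r0..r13 are fixed inline asm operands


def free_pairs(busy, reserved=RESERVED):
    """Return the (even, odd) pairs whose two halves are both usable."""
    pairs = [(even, even + 1) for even in range(0, 16, 2)]
    return [p for p in pairs if not (set(p) & (busy | reserved))]


# With r0..r13 occupied, no pair survives: 14/15 is ruled out by r15 alone.
print(free_pairs(CLOBBERED))
# With nothing occupied, every pair except 14/15 is available.
print(free_pairs(set()))
```

In this model the pair 14/15 is never a candidate, so once r0..r13 are taken
the list of usable pairs is empty regardless of what else happens.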

So the greedy allocator decides that %vreg3 needs to be spilled, and tries to
insert a reload immediately before the COPY.  However, it still reloads the
full 128-bit value, and therefore needs a register pair to temporarily hold
that reloaded value -- but again, at the point of the COPY, no register pair is
available, since registers 0..13 are live (used by the inline asm) and 14/15
cannot be used.

At this point, the allocator gives up and the error occurs.

However, looking at the actual code, it would be trivial to allocate registers,
since only the high half of %vreg3 is needed, which would fit into a single
register, in particular r14.  Unfortunately, it seems the register allocator
somehow never even considers subregs at this point.
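
To make the contrast concrete, here is a small sketch comparing the candidates
for a two-register value with those for a one-register value at the point of
the COPY.  The helper name candidates and the width parameter are hypothetical;
width 1 stands in for GR64Bit and width 2 for an aligned GR128Bit pair.

```python
# Toy comparison of register candidates at the COPY (not LLVM code).
RESERVED = {15}                   # r15 is the stack pointer
LIVE_AT_COPY = set(range(0, 14))  # r0..r13 already hold inline asm inputs


def candidates(width_in_regs, busy, reserved=RESERVED):
    """List first registers able to hold a value spanning width_in_regs GPRs.

    Width 1 models GR64Bit; width 2 models an even/odd GR128Bit pair.
    """
    out = []
    for first in range(0, 16, width_in_regs):
        regs = set(range(first, first + width_in_regs))
        if not regs & (busy | reserved):
            out.append(first)
    return out


print(candidates(2, LIVE_AT_COPY))  # reloading the full pair: nothing fits
print(candidates(1, LIVE_AT_COPY))  # reloading only one half: r14 fits
```

Under these assumptions a 64-bit reload has exactly one home, r14, which is also
the register the COPY wants to define.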

Now I'm wondering how this is supposed to be handled.  Is this a deficiency in
the common parts of the register allocator?  Should e.g. live range splitting
split %vreg3 into one range where its full 128 bits are used, and then a second
range where only the high 64 bits are used?
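
As a rough illustration of the split asked about here (and not a description of
what LLVM's live range splitting actually does), the following sketch groups the
uses of a virtual register by the lanes each use needs.  The function
split_by_lanes and the lane labels "h64"/"l64" are assumptions made up for this
example.

```python
# Hypothetical lane-based split of one virtual register (not LLVM code).
FULL = {"h64", "l64"}


def split_by_lanes(uses):
    """Group consecutive uses that need the same lane set into segments.

    uses is a list of (instruction name, set of lanes read or written).
    """
    segments = []
    for name, lanes in uses:
        if segments and segments[-1]["lanes"] == lanes:
            segments[-1]["insts"].append(name)
        else:
            segments.append({"lanes": set(lanes), "insts": [name]})
    return segments


def register_class(segment):
    """Pick the narrowest class that covers the lanes of a segment."""
    return "GR128Bit" if len(segment["lanes"]) == 2 else "GR64Bit"


# %vreg3 is fully used by DLGR, then only its high half by the COPY.
segs = split_by_lanes([("DLGR", FULL), ("COPY", {"h64"})])
for seg in segs:
    print(seg["insts"], sorted(seg["lanes"]), register_class(seg))
```

With such a split, only the first segment would need a register pair; the second
could live in a single 64-bit register.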

Or is this a problem in the target back-end code?  For example, I notice that
there is a TII.isSubregFoldable hook, which we currently do not define -- would
this help?  (OTOH, spilling shouldn't actually be necessary at all here ...)

Any help or suggestions from register allocation experts would be welcome :-)

(Note that this particular test using the inline asm is a bit artificial, but
we actually saw this error in a large real-world test case which we cannot
share publicly as-is, but which ran into the same problem.  Instead of the
inline asm making all other registers impossible to use, they all ended up
impossible to use due to a variety of other factors caused by the surrounding
code and prior decisions.)
