[llvm-bugs] [Bug 26775] New: Wrong code generation in aggressive anti-dependency breaker

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Feb 29 09:50:08 PST 2016


https://llvm.org/bugs/show_bug.cgi?id=26775

            Bug ID: 26775
           Summary: Wrong code generation in aggressive anti-dependency
                    breaker
           Product: libraries
           Version: trunk
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: uweigand at de.ibm.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Running the following test case:

target datalayout = "E-m:e-i64:64-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

declare void @func(i8*, i64, i64)

define void @test(i8* %context, i32** %elementArrayPtr, i32 %value) {
entry:
  %cmp = icmp eq i32 %value, 0
  br i1 %cmp, label %lreturn, label %lnext

lnext:
  %elementArray = load i32*, i32** %elementArrayPtr, align 8
  %element = load i32, i32* %elementArray, align 4
  %element.ext = zext i32 %element to i64
  %value.ext = zext i32 %value to i64
  call void @func(i8* %context, i64 %value.ext, i64 %element.ext)
  br label %lreturn

lreturn:
  ret void
}


on powerpc64-linux using the command line: 
  llc -optimize-regalloc=false -regalloc=fast


results in the following incorrect assembler output:

# BB#1:                                 # %lnext
        ld 3, 128(1)                    # 8-byte Folded Reload
        lwz 6, 124(1)                   # 4-byte Folded Reload
                                        # implicit-def: %X12
        ld 4, 0(3)
        ld 3, 136(1)                    # 8-byte Folded Reload
        lwz 5, 0(4)
        mr 4, 6
        clrldi   4, 12, 32
        bl func
        nop

Note how the "implicit-def: %X12" comment appears right after the
"lwz 6, 124(1)", which is of course incorrect, and how the later clrldi
reads register 12, which is actually undefined at this point.
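
The effect of the last two instructions before the call can be sketched with a
toy model. This is only an illustration: it assumes 64-bit registers held in a
dict and simplified semantics for mr and clrldi, and the run helper and the
sample values are made up for this example. It compares the form one would
expect for the zext of %value (clrldi 4, 4, 32) with the emitted one
(clrldi 4, 12, 32).

```python
MASK64 = (1 << 64) - 1

def mr(regs, dst, src):
    # Simplified "mr": copy the full 64-bit source register.
    regs[dst] = regs[src] & MASK64

def clrldi(regs, dst, src, n):
    # Simplified "clrldi": clear the high n bits of the 64-bit source.
    regs[dst] = regs[src] & (MASK64 >> n)

def run(clr_src, r6, r12):
    # Hypothetical helper: execute "mr 4, 6" followed by
    # "clrldi 4, <clr_src>, 32" and return the final r4.
    regs = {4: 0, 6: r6, 12: r12}
    mr(regs, 4, 6)
    clrldi(regs, 4, clr_src, 32)
    return regs[4]

value = 7          # stands in for the reloaded %value
garbage = 0xDEAD   # stands in for whatever r12 happens to hold

print(run(4, value, garbage))   # expected form: r4 derives from %value
print(run(12, value, garbage))  # emitted form: r4 derives from r12
```

In this model the second argument passed to func depends only on the stale
contents of r12; the copy made by "mr 4, 6" is simply overwritten.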


Looking at the MI dumps, the invalid use of %X12 is introduced by the post-RA
scheduling pass, presumably via the anti-dependency breaker.  Before the
scheduler, we have:
        %X3<def> = LD 128, %X1; mem:LD8[FixedStack1]
        %X4<def> = LD 0, %X3<kill>; mem:LD8[%elementArrayPtr]
        %X5<def> = LWZ8 0, %X4<kill>; mem:LD4[%elementArray]
        %X4<def> = IMPLICIT_DEF
        %R6<def> = LWZ 124, %X1; mem:LD4[FixedStack2]
        %R4<def> = OR %R6, %R6<kill>
        %X4<def> = RLDICL %X4<kill>, 0, 32
        %X3<def> = LD 136, %X1; mem:LD8[FixedStack0]
        BL8_NOP <ga:@func>
and afterwards we see:
        %X3<def> = LD 128, %X1; mem:LD8[FixedStack1]
        %R6<def> = LWZ 124, %X1; mem:LD4[FixedStack2]
        %X12<def> = IMPLICIT_DEF
        %X4<def> = LD 0, %X3<kill>; mem:LD8[%elementArrayPtr]
        %X3<def> = LD 136, %X1; mem:LD8[FixedStack0]
        %X5<def> = LWZ8 0, %X4<kill>; mem:LD4[%elementArray]
        %R4<def> = OR %R6<kill>, %R6<kill>
        %X4<def> = RLDICL %X12<kill>, 0, 32
        BL8_NOP <ga:@func>

It would appear that the anti-dependency breaker renamed the def-use chain of
%X4 running from the IMPLICIT_DEF to the RLDICL to use %X12 instead, without
noticing that there is a def of %R4 in between (%R4 is a subreg of %X4). As a
result, the RLDICL no longer reads the value written by the OR.
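
The kind of check that seems to be missing can be sketched as follows. This is
a toy model, not LLVM's implementation: the instruction tuples are transcribed
by hand from the pre-scheduler dump above, and SUPER, overlaps and
clobbers_between are hypothetical names introduced only for this example.
The idea is that a def-use chain may be renamed only if no instruction
between the def and the use writes an overlapping register.

```python
# Map each 32-bit sub-register to its 64-bit super-register (assumed subset).
SUPER = {"R4": "X4", "R6": "X6", "R12": "X12"}

def overlaps(a, b):
    # Two registers overlap if they share the same 64-bit super-register.
    return SUPER.get(a, a) == SUPER.get(b, b)

# (opcode, defs, uses), transcribed from the pre-scheduler MI dump.
PRE_SCHED = [
    ("LD",           ["X3"], ["X1"]),
    ("LD",           ["X4"], ["X3"]),
    ("LWZ8",         ["X5"], ["X4"]),
    ("IMPLICIT_DEF", ["X4"], []),
    ("LWZ",          ["R6"], ["X1"]),
    ("OR",           ["R4"], ["R6", "R6"]),
    ("RLDICL",       ["X4"], ["X4"]),
    ("LD",           ["X3"], ["X1"]),
    ("BL8_NOP",      [],     []),
]

def clobbers_between(instrs, reg, def_idx, use_idx):
    # Indices of instructions strictly between def and use that
    # write a register overlapping reg.
    return [i for i in range(def_idx + 1, use_idx)
            if any(overlaps(reg, d) for d in instrs[i][1])]

# Chain of %X4 from the IMPLICIT_DEF (index 3) to the RLDICL (index 6).
for i in clobbers_between(PRE_SCHED, "X4", 3, 6):
    print("intervening def by", PRE_SCHED[i][0], PRE_SCHED[i][1])
```

For this chain the check reports the OR that defines %R4, so renaming only the
IMPLICIT_DEF and the RLDICL would not be safe in this model.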

It's not fully clear to me whether the "fast" register allocator is required
to trigger the bug; however, I could not find any test case where the default
register allocator produces a register usage pattern before the post-RA
scheduler that exposes the issue.
