[llvm-bugs] [Bug 26775] New: Wrong code generation in aggressive anti-dependency breaker
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Feb 29 09:50:08 PST 2016
https://llvm.org/bugs/show_bug.cgi?id=26775
Bug ID: 26775
Summary: Wrong code generation in aggressive anti-dependency breaker
Product: libraries
Version: trunk
Hardware: Other
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Common Code Generator Code
Assignee: unassignedbugs at nondot.org
Reporter: uweigand at de.ibm.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Running the following test case:
target datalayout = "E-m:e-i64:64-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

declare void @func(i8*, i64, i64)

define void @test(i8* %context, i32** %elementArrayPtr, i32 %value) {
entry:
  %cmp = icmp eq i32 %value, 0
  br i1 %cmp, label %lreturn, label %lnext

lnext:
  %elementArray = load i32*, i32** %elementArrayPtr, align 8
  %element = load i32, i32* %elementArray, align 4
  %element.ext = zext i32 %element to i64
  %value.ext = zext i32 %value to i64
  call void @func(i8* %context, i64 %value.ext, i64 %element.ext)
  br label %lreturn

lreturn:
  ret void
}
on powerpc64-linux using the command line:
llc -optimize-regalloc=false -regalloc=fast
results in the following incorrect assembler output:
# BB#1: # %lnext
ld 3, 128(1) # 8-byte Folded Reload
lwz 6, 124(1) # 4-byte Folded Reload
# implicit-def: %X12
ld 4, 0(3)
ld 3, 136(1) # 8-byte Folded Reload
lwz 5, 0(4)
mr 4, 6
clrldi 4, 12, 32
bl func
nop
Note how the "lwz 6, 124(1)" is marked as implicit-def: %X12,
which is of course incorrect, and how the clrldi later uses
register 12, which is actually undefined at this point.
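To make the effect on the call concrete, here is a minimal Python sketch of the last two instructions before "bl func". It is only a toy model: the function name second_argument and the stale value placed in register 12 are assumptions made for illustration (the real content of r12 is simply undefined), and the "intended" variant assumes the clrldi was meant to read register 4, as the RLDICL in the pre-scheduler MI dump in this report suggests.

```python
MASK64 = (1 << 64) - 1

def clrldi(src, n):
    """Model of PowerPC clrldi: clear the n high-order bits of a 64-bit value."""
    return src & (MASK64 >> n)

def second_argument(clrldi_src, value, stale_r12):
    """Return what ends up in r4 (second argument of func).

    clrldi_src is the register number read by the clrldi:
    4 models the intended code, 12 models the code llc emitted.
    stale_r12 is an arbitrary stand-in for the undefined content of r12.
    """
    regs = {6: value & 0xFFFFFFFF,   # lwz 6, 124(1) zero-extends the reloaded %value
            12: stale_r12 & MASK64}  # never written in this block
    regs[4] = regs[6]                        # mr 4, 6
    regs[4] = clrldi(regs[clrldi_src], 32)   # clrldi 4, <src>, 32
    return regs[4]

intended = second_argument(4, 7, 0xDEADBEEFCAFEF00D)
emitted = second_argument(12, 7, 0xDEADBEEFCAFEF00D)
print(hex(intended), hex(emitted))
```

In this model the intended sequence passes the zero-extended %value, while the emitted sequence passes the low 32 bits of whatever register 12 happened to hold.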
Looking at the MI dumps, the invalid use of %X12 is introduced by the post-RA
scheduling pass, presumably via the anti-dependency breaker. Before the
scheduler, we have:
%X3<def> = LD 128, %X1; mem:LD8[FixedStack1]
%X4<def> = LD 0, %X3<kill>; mem:LD8[%elementArrayPtr]
%X5<def> = LWZ8 0, %X4<kill>; mem:LD4[%elementArray]
%X4<def> = IMPLICIT_DEF
%R6<def> = LWZ 124, %X1; mem:LD4[FixedStack2]
%R4<def> = OR %R6, %R6<kill>
%X4<def> = RLDICL %X4<kill>, 0, 32
%X3<def> = LD 136, %X1; mem:LD8[FixedStack0]
BL8_NOP <ga:@func>
and afterwards we see:
%X3<def> = LD 128, %X1; mem:LD8[FixedStack1]
%R6<def> = LWZ 124, %X1; mem:LD4[FixedStack2]
%X12<def> = IMPLICIT_DEF
%X4<def> = LD 0, %X3<kill>; mem:LD8[%elementArrayPtr]
%X3<def> = LD 136, %X1; mem:LD8[FixedStack0]
%X5<def> = LWZ8 0, %X4<kill>; mem:LD4[%elementArray]
%R4<def> = OR %R6<kill>, %R6<kill>
%X4<def> = RLDICL %X12<kill>, 0, 32
BL8_NOP <ga:@func>
It would appear that the dependency breaker renamed the def-use chain of %X4
running from the IMPLICIT_DEF to the RLDICL to use %X12 instead, without
noticing that there is a def of %R4 in between (which is a subreg of %X4).
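The condition that seems to have been missed can be sketched in a few lines of Python. This is not the logic of the actual anti-dependency breaker, only a toy model over the pre-scheduler dump above: the instruction tuples, the overlaps rule (an X register overlaps the R register with the same number) and the helper name intervening_overlapping_defs are all assumptions made for illustration.

```python
# (opcode, defs, uses) transcribed from the pre-scheduler MI dump.
BEFORE = [
    ("LD",           ["X3"], ["X1"]),
    ("LD",           ["X4"], ["X3"]),
    ("LWZ8",         ["X5"], ["X4"]),
    ("IMPLICIT_DEF", ["X4"], []),
    ("LWZ",          ["R6"], ["X1"]),
    ("OR",           ["R4"], ["R6", "R6"]),
    ("RLDICL",       ["X4"], ["X4"]),
    ("LD",           ["X3"], ["X1"]),
    ("BL8_NOP",      [],     []),
]

def overlaps(a, b):
    """Toy aliasing rule: X<n> and R<n> share storage (R<n> is a subreg of X<n>)."""
    return a[1:] == b[1:]

def intervening_overlapping_defs(instrs, reg, def_idx, use_idx):
    """Indices strictly between def_idx and use_idx that define a register
    overlapping reg. If this list is non-empty, the use at use_idx does not
    read only the value produced at def_idx, so renaming just those two
    instructions changes what the use sees."""
    return [i for i in range(def_idx + 1, use_idx)
            if any(overlaps(reg, d) for d in instrs[i][1])]

# Chain that was renamed to %X12: IMPLICIT_DEF (index 3) to RLDICL (index 6).
blockers = intervening_overlapping_defs(BEFORE, "X4", 3, 6)
for i in blockers:
    print("overlapping def in between:", BEFORE[i])
```

Under this model the OR that defines %R4 is reported as an overlapping def inside the chain, so renaming only the IMPLICIT_DEF and the RLDICL to %X12 leaves the RLDICL reading a register that the OR never wrote.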
It's not fully clear to me whether use of the "fast" register allocator is
necessary in general to trigger the bug; however, I didn't find any test case
using the default register allocator that would create a register usage pattern
before the post-RA scheduler that shows the issue.