[LLVMbugs] [Bug 22074] New: Redundant copy in reduction loop
bugzilla-daemon at llvm.org
bugzilla-daemon at llvm.org
Wed Dec 31 06:25:32 PST 2014
http://llvm.org/bugs/show_bug.cgi?id=22074
Bug ID: 22074
Summary: Redundant copy in reduction loop
Product: libraries
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Common Code Generator Code
Assignee: unassignedbugs at nondot.org
Reporter: michael.m.kuperstein at intel.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
The code below performs a trivial reduction on an array (basically, calculating
a checksum):
target triple = "i386-linux"
define zeroext i16 @loop(i16* readonly %p, i16* readnone %q) #0 {
entry:
%cmp4 = icmp eq i16* %p, %q
br i1 %cmp4, label %while.end, label %while.body
while.body:
%v.06 = phi i32 [ %add, %while.body ], [ 0, %entry ]
%p.addr.05 = phi i16* [ %incdec.ptr, %while.body ], [ %p, %entry ]
%incdec.ptr = getelementptr inbounds i16* %p.addr.05, i32 1
%0 = load i16* %p.addr.05, align 2
%conv = zext i16 %0 to i32
%add = add i32 %conv, %v.06
%cmp = icmp eq i16* %incdec.ptr, %q
br i1 %cmp, label %while.cond.while.end_crit_edge, label %while.body
while.cond.while.end_crit_edge:
%phitmp = trunc i32 %add to i16
br label %while.end
while.end:
%v.0.lcssa = phi i16 [ %phitmp, %while.cond.while.end_crit_edge ], [ 0,
%entry ]
ret i16 %v.0.lcssa
}
The code we currently get for the loop body is:
.LBB0_1:
movl %eax, %esi
movzwl (%edx), %eax
addl $2, %edx
addl %esi, %eax
cmpl %edx, %ecx
jne .LBB0_1
.LBB0_2:
Instead of the expected:
.LBB0_1:
movzwl (%edx), %esi
addl $2, %edx
addl %esi, %eax
cmpl %edx, %ecx
jne .LBB0_1
.LBB0_2:
The problem is that instead of adding the loaded value into the accumulator, we
do the reverse, forcing us to copy the accumulator between iterations.
There are three different things here that could go right instead of going
wrong:
1) The two-address instruction pass tries to guess when it's worth to commute
the instruction when doing the 3-addr -> 2-addr transformation. The heuristic
there is rather limited, and it doesn't fire in this case.
2) The register coalescer also tries to commute two-address instructions when
it runs into situations it considers profitable.
Specifically, if it gets this:
BB#2:
%vreg3<def> = MOVZX32rm16 %vreg15, 1, %noreg, 0, %noreg; mem:LD2[%p.addr.05]
GR32:%vreg3,%vreg15
%vreg15<def,tied1> = ADD32ri8 %vreg15<tied0>, 2, %EFLAGS<imp-def,dead>;
GR32:%vreg15
%vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg14, %EFLAGS<imp-def,dead>;
GR32:%vreg3,%vreg14
CMP32rr %vreg7, %vreg15, %EFLAGS<imp-def>; GR32:%vreg7,%vreg15
%vreg14<def> = COPY %vreg3; GR32:%vreg14,%vreg3
JNE_4 <BB#2>, %EFLAGS<imp-use,kill>
JMP_4 <BB#3>
Where it can't coalesce vreg14 and vreg3 because of the def of vreg3 in the
beginning of the loop, it will commute the ADD32rr that is the def of the
copied register, making the copy a nop.
However, what we actually get is:
BB#0:
[...]
%vreg14<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg14
[...]
BB#2:
%vreg0<def> = COPY %vreg14; GR32:%vreg0,%vreg14
%vreg14<def> = MOVZX32rm16 %vreg15, 1, %noreg, 0, %noreg; mem:LD2[%p.addr.05]
GR32:%vreg14,%vreg15
%vreg15<def,tied1> = ADD32ri8 %vreg15<tied0>, 2, %EFLAGS<imp-def,dead>;
GR32:%vreg15
%vreg14<def,tied1> = ADD32rr %vreg14<tied0>, %vreg0, %EFLAGS<imp-def,dead>;
GR32:%vreg14,%vreg0
CMP32rr %vreg7, %vreg15, %EFLAGS<imp-def>; GR32:%vreg7,%vreg15
JNE_4 <BB#2>, %EFLAGS<imp-use,kill>
JMP_4 <BB#3>
The copy we have is the remnant of PHI elimination, and can see two defs of
vreg14. This pattern isn't caught by the current code, so we don't commute.
3) The reason we see the "wrong" copy has to do with the order register
coalescing tries to eliminate the copies. When eliminating the copies bottom
-up (-join-globalcopies=false) we end up with the copy the coalescer will try
to commute. However, with join-globalcopies enabled, we try to eliminate the
"outgoing" copy of %vreg14 first, and get stuck with the one we don't know how
to handle.
Any suggestions on where this should be fixed - (1), (2) or (3)?
I'm a bit at a loss...
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20141231/04f11f88/attachment.html>
More information about the llvm-bugs
mailing list