[llvm] r184575 - Fix PR16360

Fri Jun 21 17:33:07 PDT 2013

Good point. I thought the original transformation is targeted to reduce
overhead of shift in some scenarios when data types must be extended,
e.g. 64-bit integer on 32-bit targets. It's painful to do that if all
significant bits are required.

For 64-bit targets, reducing a 64-bit shift to a 32-bit one might have
benefit as well. E.g., you may save one byte on x86-64 without encoding
the W bit.

We could go even aggressively by folding (srl (anyextend x), c) to (zext
(srl x, c)) to help generate efficient code on 64-bit targets as 'zext'
has better support. I verified it on x86-64 and ppc64 where the same
code is generated before the bug is fixed. I checked aarch64 and mips64
as well but it seems those targets doesn't take the advantage that high
bits are set to ZERO (I learned from public source that, with 32-bit
instruction, on aarch64, the upper 32 bits of the destination register
are set to ZERO and, on mips64, the upper 32 bits of the destination
register will be sign extended and will be cleared for this case.)
Please check the patch attached.

- michael

On Fri, 2013-06-21 at 13:39 -0700, Nadav Rotem wrote:
> I understand that you fixed this transformation but now it does not look profitable. Do you think that it is profitable for all platforms ? If not then you can either erase this code or move it to the backends that do support it. 
> 
> > On Jun 21, 2013, at 13:02, Michael Liao <michael.liao at intel.com> wrote:
> > 
> > Hi Nadav
> > 
> > That patch make that transformation semantically correct as (srl
> > (anyextend x), c) will clear off the high bits but (anyextend (srl x,
> > c)) will leave high bits undefined.
> > 
> > BTW, PR16360 is not a target specific issue as reported
> > http://llvm.org/bugs/show_bug.cgi?id=16360
> > 
> > Thanks
> > - Michael
> > 
> >> On Fri, 2013-06-21 at 12:22 -0700, Nadav Rotem wrote:
> >> It is not clear to me why this transformation is profitable for all
> >> platforms. You are adding an AND operation here. Should this
> >> optimization be target specific maybe ?
> >> 
> >> On Jun 21, 2013, at 11:45 AM, Michael Liao <michael.liao at intel.com>
> >> wrote:
> >> 
> >>> Author: hliao
> >>> Date: Fri Jun 21 13:45:27 2013
> >>> New Revision: 184575
> >>> 
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=184575&view=rev
> >>> Log:
> >>> Fix PR16360
> >>> 
> >>> When (srl (anyextend x), c) is folded into (anyextend (srl x, c)),
> >>> the
> >>> high bits are not cleared. Add 'and' to clear off them.
> >>> 
> >>> 
> >>> Added:
> >>>   llvm/trunk/test/CodeGen/X86/pr16360.ll
> >>> Modified:
> >>>   llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >>> 
> >>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=184575&r1=184574&r2=184575&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> >>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Jun 21
> >>> 13:45:27 2013
> >>> @@ -3915,8 +3915,7 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
> >>>                       DAG.getConstant(~0ULL >> ShAmt, VT));
> >>>  }
> >>> 
> >>> -
> >>> -  // fold (srl (anyextend x), c) -> (anyextend (srl x, c))
> >>> +  // fold (srl (anyextend x), c) -> (and (anyextend (srl x, c)),
> >>> mask)
> >>>  if (N1C && N0.getOpcode() == ISD::ANY_EXTEND) {
> >>>    // Shifting in all undef bits?
> >>>    EVT SmallVT = N0.getOperand(0).getValueType();
> >>> @@ -3929,7 +3928,10 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
> >>>                                       N0.getOperand(0),
> >>>                          DAG.getConstant(ShiftAmt,
> >>> getShiftAmountTy(SmallVT)));
> >>>      AddToWorkList(SmallShift.getNode());
> >>> -      return DAG.getNode(ISD::ANY_EXTEND, SDLoc(N), VT,
> >>> SmallShift);
> >>> +      APInt Mask =
> >>> APInt::getAllOnesValue(VT.getSizeInBits()).lshr(ShiftAmt);
> >>> +      return DAG.getNode(ISD::AND, SDLoc(N), VT,
> >>> +                         DAG.getNode(ISD::ANY_EXTEND, SDLoc(N), VT,
> >>> SmallShift),
> >>> +                         DAG.getConstant(Mask, VT));
> >>>    }
> >>>  }
> >>> 
> >>> 
> >>> Added: llvm/trunk/test/CodeGen/X86/pr16360.ll
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pr16360.ll?rev=184575&view=auto
> >>> ==============================================================================
> >>> --- llvm/trunk/test/CodeGen/X86/pr16360.ll (added)
> >>> +++ llvm/trunk/test/CodeGen/X86/pr16360.ll Fri Jun 21 13:45:27 2013
> >>> @@ -0,0 +1,16 @@
> >>> +; RUN: llc < %s -mtriple=i686-pc-linux | FileCheck %s
> >>> +
> >>> +define i64 @foo(i32 %sum) {
> >>> +entry:
> >>> +  %conv = sext i32 %sum to i64
> >>> +  %shr = lshr i64 %conv, 2
> >>> +  %or = or i64 4611686018360279040, %shr
> >>> +  ret i64 %or
> >>> +}
> >>> +
> >>> +; CHECK: foo
> >>> +; CHECK: shrl $2
> >>> +; CHECK: orl $-67108864
> >>> +; CHECK-NOT: movl $-1
> >>> +; CHECK: movl $1073741823
> >>> +; CHECK: ret
> >>> 
> >>> 
> >>> _______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> 
> >> 
> > 
> > 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Revise-fix-to-PR16360-for-efficient-code-on-64-bit-t.patch
Type: text/x-patch
Size: 2192 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130621/0d61b07a/attachment.bin>