[llvm] r184575 - Fix PR16360

Thu Jun 27 10:39:46 PDT 2013

I am not sure how this patch addresses the cost problem. You are still generating a new ZEXT instruction.  Why is it better for all platforms ? Is it better for AMD’s GPU ? If it is better only for ARM and X86 then you need to put it in these backends. 

On Jun 23, 2013, at 10:39 PM, "Liao, Michael" <michael.liao at intel.com> wrote:

> Sorry, I don't why the system put that mail into junk box. I just check that about. In case you miss my reply. I put it here again.
> 
> Good point. I thought the original transformation is targeted to reduce overhead of shift in some scenarios when data types must be extended, e.g. 64-bit integer on 32-bit targets. It's painful to do that if all significant bits are required.
> 
> For 64-bit targets, reducing a 64-bit shift to a 32-bit one might have benefit as well. E.g., you may save one byte on x86-64 without encoding the W bit.
> 
> We could go even aggressively by folding (srl (anyextend x), c) to (zext (srl x, c)) to help generate efficient code on all targets as 'zext'
> has better support. I verified it on x86-64 and ppc64 where the same code is generated before the bug is fixed. I checked aarch64 and mips64 as well but it seems those targets doesn't take the advantage that high bits are set to ZERO (I learned from public source that, with 32-bit instruction, on aarch64, the upper 32 bits of the destination register are set to ZERO and, on mips64, the upper 32 bits of the destination register will be sign extended and will be cleared for this case.) Please check the patch attached.
> 
> - michael
> 
> 
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
> Sent: Sunday, June 23, 2013 5:49 AM
> To: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm] r184575 - Fix PR16360
> 
> On 21/06/13 21:22, Nadav Rotem wrote:
>> It is not clear to me why this transformation is profitable for all platforms.
>> You are adding an AND operation here. Should this optimization be 
>> target specific maybe ?
> 
> I made the same point and also mentioned a better way of generating the mask, but Michael seems to have ignored my email.
> 
> Ciao, Duncan.
> 
>> 
>> On Jun 21, 2013, at 11:45 AM, Michael Liao <michael.liao at intel.com 
>> <mailto:michael.liao at intel.com>> wrote:
>> 
>>> Author: hliao
>>> Date: Fri Jun 21 13:45:27 2013
>>> New Revision: 184575
>>> 
>>> URL:http://llvm.org/viewvc/llvm-project?rev=184575&view=rev
>>> Log:
>>> Fix PR16360
>>> 
>>> When (srl (anyextend x), c) is folded into (anyextend (srl x, c)), 
>>> the high bits are not cleared. Add 'and' to clear off them.
>>> 
>>> 
>>> Added:
>>>   llvm/trunk/test/CodeGen/X86/pr16360.ll
>>> Modified:
>>>   llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>> 
>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>> URL:http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/Select
>>> ionDAG/DAGCombiner.cpp?rev=184575&r1=184574&r2=184575&view=diff
>>> =====================================================================
>>> =========
>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Jun 21 
>>> +++ 13:45:27 2013
>>> @@ -3915,8 +3915,7 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
>>>                       DAG.getConstant(~0ULL >> ShAmt, VT));
>>>  }
>>> 
>>> -
>>> -  // fold (srl (anyextend x), c) -> (anyextend (srl x, c))
>>> +  // fold (srl (anyextend x), c) -> (and (anyextend (srl x, c)), 
>>> + mask)
>>>  if (N1C && N0.getOpcode() == ISD::ANY_EXTEND) {
>>>    // Shifting in all undef bits?
>>>    EVT SmallVT = N0.getOperand(0).getValueType(); @@ -3929,7 
>>> +3928,10 @@ SDValue DAGCombiner::visitSRL(SDNode *N)
>>>                                       N0.getOperand(0),
>>>                          DAG.getConstant(ShiftAmt, getShiftAmountTy(SmallVT)));
>>>      AddToWorkList(SmallShift.getNode());
>>> -      return DAG.getNode(ISD::ANY_EXTEND, SDLoc(N), VT, SmallShift);
>>> +      APInt Mask = APInt::getAllOnesValue(VT.getSizeInBits()).lshr(ShiftAmt);
>>> +      return DAG.getNode(ISD::AND, SDLoc(N), VT,
>>> +                         DAG.getNode(ISD::ANY_EXTEND, SDLoc(N), VT, SmallShift),
>>> +                         DAG.getConstant(Mask, VT));
>>>    }
>>>  }
>>> 
>>> 
>>> Added: llvm/trunk/test/CodeGen/X86/pr16360.ll
>>> URL:http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/p
>>> r16360.ll?rev=184575&view=auto 
>>> =====================================================================
>>> =========
>>> --- llvm/trunk/test/CodeGen/X86/pr16360.ll (added)
>>> +++ llvm/trunk/test/CodeGen/X86/pr16360.ll Fri Jun 21 13:45:27 2013
>>> @@ -0,0 +1,16 @@
>>> +; RUN: llc < %s -mtriple=i686-pc-linux | FileCheck %s
>>> +
>>> +define i64 @foo(i32 %sum) {
>>> +entry:
>>> +  %conv = sext i32 %sum to i64
>>> +  %shr = lshr i64 %conv, 2
>>> +  %or = or i64 4611686018360279040, %shr
>>> +  ret i64 %or
>>> +}
>>> +
>>> +; CHECK: foo
>>> +; CHECK: shrl $2
>>> +; CHECK: orl $-67108864
>>> +; CHECK-NOT: movl $-1
>>> +; CHECK: movl $1073741823
>>> +; CHECK: ret
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu> 
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> <0001-Revise-fix-to-PR16360-for-efficient-code-on-64-bit-t.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130627/49a37647/attachment.html>