[llvm] r235079 - TRUNCATE constant folding - minor fix for rL233224

Fri Apr 17 06:39:27 PDT 2015

Hey Simon,

Thanks for the pointer and the complete explanations!

I should always RTFM before asking! ;)

— 
Mehdi

> On Apr 17, 2015, at 2:07 AM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
> 
> Hi Mehdi,
> 
> Sorry for replying so late to this thread.
> 
> If you consult the description in ISDOpcodes.h:
> 
>     /// BUILD_VECTOR(ELT0, ELT1, ELT2, ELT3,...) - Return a vector with the
>     /// specified, possibly variable, elements.  The number of elements is
>     /// required to be a power of two.  The types of the operands must all be
>     /// the same and must match the vector element type, except that integer
>     /// types are allowed to be larger than the element type, in which case
>     /// the operands are implicitly truncated.
>     BUILD_VECTOR,
> 
> So its only integer types that are allowed to do implicit truncation, float scalar operands must have the same type + size as the resultant vector’s element type. Other scalar <-> vector opcodes have similar funkiness (implicit truncation/sign-extension for integer types) that you should be aware of. We’re making use of this behaviour purely for constant vectors, which target lowering code is more than capable of dealing with (for the legality reasons James discussed), although technically target lowering code must also deal with truncation of non-constant scalar operands, but that isn’t an issue here.
> 
> Also, the other ‘gotcha’ in BUILD_VECTOR is that all scalar operands must be of the same type. So if there are integers that need implicit truncation then they must all have the same type so that they are truncated in the same manner:
> 
>    i32,i32 scalars -> v2i16 // LEGAL
>    i32,i64 scalars -> v2i16 // ILLEGAL
> 
> Simon.
> 
>> On 16 Apr 2015, at 23:13, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>> 
>> I see, yes it helps, thanks!
>> 
>> What about floating point types?
>> 
>> — 
>> Mehdi
>> 
>>> On Apr 16, 2015, at 1:03 PM, James Molloy <james at jamesmolloy.co.uk <mailto:james at jamesmolloy.co.uk>> wrote:
>>> 
>>> Right, sorry my answer was rather ARM specific as that was what the testcase that caught this bug was based on. 
>>> 
>>> (On ARM) , i16 is illegal but v4i16 is illegal. There are many other examples in different architectures where the set of scalar legal types isn't an exact match with the set of vector legal types. So BUILD_VECTOR really does need to deal with implicit truncations like this. 
>>> 
>>> In fact, the size of a type may be smaller than the size of its container. Consider an i16 in an i32 register- the upper 16 bits can be undefined in DAG form. We can use an SIGN_EXTEND_INREG for example to widen it to i32, but the input and output type of the sign extend nose will both be i32!
>>> 
>>> I hope this helps,
>>> 
>>> James
>>> On Thu, 16 Apr 2015 at 20:58, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>>> Hi James,
>>> 
>>> I’m not sure what you mean about i16 being illegal? Isn’t it something target specific while this commit change a target-independent behavior? What if I have a target where both i32 and i16 are legal?
>>> 
>>> My impression was that if you feed two i32 to BUILD_VECTOR you should get a v2i32, and you can’t get a v2i16 if you don’t feed i16 inputs. But if I understood correctly Ahmed’s answer, there are implicit conversions happening with BUILD_VECTOR.
>>> 
>>> — 
>>> Mehdi
>>> 
>>> 
>>>> On Apr 16, 2015, at 12:41 PM, James Molloy <james at jamesmolloy.co.uk <mailto:james at jamesmolloy.co.uk>> wrote:
>>>> 
>>>> Hi Mehdi,
>>>> 
>>>> It's not altogether surprising- after type legalisation BUILD_VECTOR can't have any illegal types feeding it, and i16 is illegal. I32 is the smallest legal type. 
>>>> 
>>>> Therefore it must be able deal with truncation!
>>>> 
>>>> Cheers,
>>>> 
>>>> James
>>>> On Thu, 16 Apr 2015 at 20:24, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>>>> 
>>>> > On Apr 16, 2015, at 10:52 AM, Ahmed Bougacha <ahmed.bougacha at gmail.com <mailto:ahmed.bougacha at gmail.com>> wrote:
>>>> >
>>>> > On Thu, Apr 16, 2015 at 10:44 AM, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>>>> >> Hi Simon,
>>>> >>
>>>> >> I have an out-of-tree target broken by this commits, and I have a feeling that the problem is in this change.
>>>> >>
>>>> >> I have this node:  v2i32  BUILD_VECTOR { i32 = Constant<0>,  i32 = undef  }
>>>> >> That I try to truncate:
>>>> >> getNode(ISD::TRUNCATE, dl, v2i16, N)
>>>> >>
>>>> >> After your change, getNode() returns this node:
>>>> >>
>>>> >> N = v2i16  BUILD_VECTOR { i32 = Constant<0>,  i32 = undef  }
>>>> >>
>>>> >> Note that the operands are not truncated, which breaks the DAG type system.
>>>> >
>>>> > It does?  BUILD_VECTOR can truncate integers, and that's the point of
>>>> > this fix, no?
>>>> 
>>>> I didn’t know BUILD_VECTOR had an implicit TRUNCATE.  Does it carry only implicit truncate or other conversion as well (EXTEND for instance)?
>>>> 
>>>> So we are not allowed to fold naively an EXTRACT_VECTOR_ELEMENT that is fed with a BUILD_VECTOR?
>>>> This does not seem in-line with usual behavior in the DAG and it makes combining more complex.
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Mehdi
>>>> 
>>>> >
>>>> > -Ahmed
>>>> >
>>>> >> I assume you are relying on FoldConstantArithmetic() to magically convert the constant to the truncated type, however it does not work with undef.
>>>> >>
>>>> >> I have the feeling that the proper way requires to wrap every operand with a TRUNCATE before calling getNode, but you may see an alternative?
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Mehdi
>>>> >>
>>>> >>
>>>> >>
>>>> >>> On Apr 16, 2015, at 1:21 AM, Simon Pilgrim <llvm-dev at redking.me.uk <mailto:llvm-dev at redking.me.uk>> wrote:
>>>> >>>
>>>> >>> Author: rksimon
>>>> >>> Date: Thu Apr 16 03:21:09 2015
>>>> >>> New Revision: 235079
>>>> >>>
>>>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=235079&view=rev <http://llvm.org/viewvc/llvm-project?rev=235079&view=rev>
>>>> >>> Log:
>>>> >>> TRUNCATE constant folding - minor fix for rL233224
>>>> >>>
>>>> >>> Fix for test case found by James Molloy - TRUNCATE of constant build vectors can be more simply achieved by simply replacing with a new build vector node with the truncated value type - no need to touch the scalar operands at all.
>>>> >>>
>>>> >>> Added:
>>>> >>>   llvm/trunk/test/CodeGen/AArch64/fold-constants.ll
>>>> >>> Modified:
>>>> >>>   llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>>>> >>>
>>>> >>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>>>> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=235079&r1=235078&r2=235079&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=235079&r1=235078&r2=235079&view=diff>
>>>> >>> ==============================================================================
>>>> >>> --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
>>>> >>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Thu Apr 16 03:21:09 2015
>>>> >>> @@ -2851,13 +2851,16 @@ SDValue SelectionDAG::getNode(unsigned O
>>>> >>>        // FIXME: Entirely reasonable to perform folding of other unary
>>>> >>>        // operations here as the need arises.
>>>> >>>        break;
>>>> >>> +      case ISD::TRUNCATE:
>>>> >>> +        // Constant build vector truncation can be done with the original scalar
>>>> >>> +        // operands but with a new build vector with the truncated value type.
>>>> >>> +        return getNode(ISD::BUILD_VECTOR, DL, VT, BV->ops());
>>>> >>>      case ISD::FNEG:
>>>> >>>      case ISD::FABS:
>>>> >>>      case ISD::FCEIL:
>>>> >>>      case ISD::FTRUNC:
>>>> >>>      case ISD::FFLOOR:
>>>> >>>      case ISD::FP_EXTEND:
>>>> >>> -      case ISD::TRUNCATE:
>>>> >>>      case ISD::UINT_TO_FP:
>>>> >>>      case ISD::SINT_TO_FP: {
>>>> >>>        // Let the above scalar folding handle the folding of each element.
>>>> >>>
>>>> >>> Added: llvm/trunk/test/CodeGen/AArch64/fold-constants.ll
>>>> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/fold-constants.ll?rev=235079&view=auto <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/fold-constants.ll?rev=235079&view=auto>
>>>> >>> ==============================================================================
>>>> >>> --- llvm/trunk/test/CodeGen/AArch64/fold-constants.ll (added)
>>>> >>> +++ llvm/trunk/test/CodeGen/AArch64/fold-constants.ll Thu Apr 16 03:21:09 2015
>>>> >>> @@ -0,0 +1,21 @@
>>>> >>> +; RUN: llc -mtriple=aarch64-linux-gnu -o - %s | FileCheck %s
>>>> >>> +
>>>> >>> +define i64 @dotests_616() {
>>>> >>> +; CHECK-LABEL: dotests_616
>>>> >>> +; CHECK:       movi d0, #0000000000000000
>>>> >>> +; CHECK-NEXT:  umov w8, v0.b[2]
>>>> >>> +; CHECK-NEXT:  sbfx w8, w8, #0, #1
>>>> >>> +; CHECK-NEXT:  fmov s0, w8
>>>> >>> +; CHECK-NEXT:  fmov x0, d0
>>>> >>> +; CHECK-NEXT:  ret
>>>> >>> +entry:
>>>> >>> +  %0 = bitcast <2 x i64> zeroinitializer to <8 x i16>
>>>> >>> +  %1 = and <8 x i16> zeroinitializer, %0
>>>> >>> +  %2 = icmp ne <8 x i16> %1, zeroinitializer
>>>> >>> +  %3 = extractelement <8 x i1> %2, i32 2
>>>> >>> +  %vgetq_lane285 = sext i1 %3 to i16
>>>> >>> +  %vset_lane = insertelement <4 x i16> undef, i16 %vgetq_lane285, i32 0
>>>> >>> +  %4 = bitcast <4 x i16> %vset_lane to <1 x i64>
>>>> >>> +  %vget_lane = extractelement <1 x i64> %4, i32 0
>>>> >>> +  ret i64 %vget_lane
>>>> >>> +}
>>>> >>>
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> llvm-commits mailing list
>>>> >>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
>>>> >>
>>>> >>
>>>> >> _______________________________________________
>>>> >> llvm-commits mailing list
>>>> >> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
>>>> 
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150417/09f9f55a/attachment.html>