[PATCH] Fix for previously reverted SLPVectorizer cost model improvement

Tue Apr 8 07:50:46 PDT 2014

On Mon, Apr 7, 2014 at 6:34 PM, Arnold Schwaighofer
<aschwaighofer at apple.com> wrote:
>
> On Apr 6, 2014, at 9:14 PM, Nadav Rotem <nrotem at apple.com> wrote:
>
>> +    static inline bool isTriviallyVectorizable(Intrinsic::ID ID) {
>>
>> I don't think that this function belongs in include/llvm/IR/IntrinsicInst.h.  This function describes the current capabilities of the loop vectorizer. Would you add a similar function that describes functions that can be LICM-ed or Scalarized or Unrolled into Intrinsics.h?  Moreover, we could easily teach the vectorizer to vectorize these functions (we have done it with Intel's OpenCL vectorizer).
>>
>
> I think this api describes a property of the intrinsic instruction: Whether all operands are vector operands of the same width when the instruction is vectorized. Maybe, we should name it such: areAllOpdsIdenticallyWidened?
>
> It is used by the two vectorizers to determine whether they can vectorize the instruction by simply widening the types:
>
>   float @llvm.pow.f32(float  %Val, i32 %power) => <2 x float> @llvm.pow.f32(<2 x float> %Val, <2 x i32> %power)
>
> in contrast to:
>
>   float @llvm.powi.f32(float  %Val, i32 %power) => <2 x float> @llvm.powi.f32(<2 x float> %Val, i32 %power)
>

I think I agree with Nadav here. It's a piece of knowledge that's not
generally useful outside of the vectorizer pass and there's really no
reason for it to be in Intrinsics.h. It describes a property, but we
should only put things in Intrinsics.h if they describe properties on
the intrinsic definition.

Thoughts?

-eric

>
> Best,
> Arnold
>
>
>
>>
>> On Apr 6, 2014, at 5:47 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>>
>>> Can you change the comment to:
>>>
>>>
>>> +    /// This method returns true if the intrinsic's argument types are all scalars
>>>
>>>
>>> from:
>>>
>>>
>>> +    /// Method returns true if intrinsic arguments types are all scalars
>>> +    /// for the scalar form of the intrinsic and all vectors for the vector
>>> +    /// form of the intrinsic.
>>>
>>>
>>>
>>> with this change this LGTM.
>>>
>>>
>>>
>>> On 04/04/14, Tyler Nowicki   wrote:
>>>> Thanks for the review
>>>>
>>>> Here is an updated patch. I made the changes you asked for including splitting out the unrelated code, modifying the comment, removing the lifetime intrinsics. I also adding ctpop which was previously missing.
>>>>
>>>> Tyler
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Apr 3, 2014, at 4:30 PM, arnold <aschwaighofer at apple.com> wrote:
>>>>
>>>>> --- include/llvm/IR/IntrinsicInst.h        (revision 205571)
>>>>> +++ include/llvm/IR/IntrinsicInst.h        (working copy)
>>>>> @@ -44,6 +44,38 @@
>>>>>     return (Intrinsic::ID)getCalledFunction()->getIntrinsicID();
>>>>>   }
>>>>>
>>>>> +    /// Method returns true if any intrinsic argument types are scalar
>>>>> +    /// in both vector and scalar forms of the intrisinc.
>>>>>
>>>>> The other way round (except for lifetime_start). Like you describe below in the explanation of the patch.
>>>>>
>>>>> +    ///
>>>>> +    static inline bool isTriviallyVectorizable(Intrinsic::ID ID) {
>>>>> +      switch (ID) {
>>>>> +        case Intrinsic::sort:
>>>>> ...
>>>>> +        case Intrinsic::lifetime_start:
>>>>> +        case Intrinsic::lifetime_end:
>>>>> +          return true;
>>>>> +        default:
>>>>> +          return false;
>>>>> +      }
>>>>> +    }
>>>>>
>>>>> Because of Intrinsic::lifetime_start the comment is not totally true. A lifetime_start/end is vectorizable because we simple scalarize it in the loop vectorizer. For all other intrinsics we can vectorize them because all their arguments become vectors when vectorized (like you mention below). I think lifetime_start/stop should be handled separately (not be checked as part of isTriviallyVectorizable).
>>>>>
>>>>> Can you separate the patch into two patches? One that corrects the handling of mixed argument intrinsic and the second one reverting my revert.
>>>>>
>>>>> You can test the first patch by moving both insertelement instructions in the example non-vectorizable-intrinsic.ll before the return.
>>>>>
>>>>> Thanks for working on this!
>>>>>
>>>>> On Apr 3, 2014, at 2:44 PM, Tyler Nowicki <tnowicki at apple.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This patch modifies r205260 which was previously reverted by r205018 because it a crash. The original patch by Arch Robison and Arnold Schwaighofer.
>>>>>>
>>>>>> This patch stops the SLP vectorizer from trying to vectorize a call that is not trivially vectorizable. An intrinsic is not trivially vectorized if any of the arguments cannot be simply packed together into a tuple. For example, the call to CTLZ cannot be trivially replaced with a vector intrinsic because the second parameter is scalar in both its scalar and vector forms.
>>>>>>
>>>>>> Tyler Nowicki
>>>>>>
>>>>>> <slp_insertelements_fix2.patch>_______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits