[SLP Vectorizer] Generate extract for in-tree uses if the use is scalar operand in vectorized instruction

yijiang yjiang at apple.com
Tue Sep 2 13:58:44 PDT 2014


Hi Arnold, 

Thank you for your comments! Actually, this patch does affect the cost estimate. In the loop that goes through all the uses and builds the ExternalUses list, we check every in-tree use with InTreeUserNeedToExtract and add it to ExternalUses if the check returns true. That extra entry in the list then adds cost when we do the cost estimation.
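
To make the point concrete, here is a rough toy model of that loop. It is not code from the patch: Node, Opcode, collectExternalUses, extractCost, and the unit cost of 1 per extract are all made up for illustration, and the predicate only models the load/store pointer-operand case. The example graph loosely mirrors the fn2 test case quoted below.

```cpp
// Toy model only. Node and Opcode are hypothetical stand-ins for
// llvm::Value / llvm::Instruction; nothing here is the real LLVM API.
#include <cassert>
#include <set>
#include <utility>
#include <vector>

enum class Opcode { GEP, PtrToInt, Load, Store, Other };

struct Node {
  Opcode op = Opcode::Other;
  const Node *pointerOperand = nullptr; // only meaningful for Load/Store
  std::vector<const Node *> users;
};

// Assumption: a vectorized load/store keeps a scalar pointer operand, so a
// tree scalar feeding that operand still has to be extracted from the vector.
bool inTreeUserNeedsExtract(const Node &scalar, const Node &user) {
  switch (user.op) {
  case Opcode::Load:
  case Opcode::Store:
    return user.pointerOperand == &scalar;
  default:
    return false;
  }
}

using ExternalUse = std::pair<const Node *, const Node *>; // (scalar, user)

// Walk every use of every tree scalar. Out-of-tree users always need an
// extract; in-tree users need one only if the predicate above says so.
std::vector<ExternalUse>
collectExternalUses(const std::vector<const Node *> &treeScalars) {
  std::set<const Node *> inTree(treeScalars.begin(), treeScalars.end());
  std::vector<ExternalUse> uses;
  for (const Node *scalar : treeScalars)
    for (const Node *user : scalar->users) {
      bool userInTree = inTree.count(user) != 0;
      if (!userInTree || inTreeUserNeedsExtract(*scalar, *user))
        uses.emplace_back(scalar, user);
    }
  return uses;
}

// Hypothetical cost: every recorded use pays for one extract.
int extractCost(const std::vector<ExternalUse> &uses, int costPerExtract) {
  return static_cast<int>(uses.size()) * costPerExtract;
}

// Graph shaped like the test case: two GEPs (A, B), two ptrtoint casts,
// and two stores, where the first store is addressed through B.
int exampleExtractCost() {
  Node gepA, gepB, castA, castB, idx12, store0, store1;
  gepA.op = gepB.op = idx12.op = Opcode::GEP;
  castA.op = castB.op = Opcode::PtrToInt;
  store0.op = store1.op = Opcode::Store;
  store0.pointerOperand = &gepB;  // a[11] is addressed through B
  store1.pointerOperand = &idx12; // a[12] uses a GEP outside the tree
  gepA.users = {&castA};
  gepB.users = {&castB, &store0};
  castA.users = {&store1};
  castB.users = {&store0};
  idx12.users = {&store1};

  std::vector<const Node *> tree = {&gepA,  &gepB,   &castA,
                                    &castB, &store0, &store1};
  std::vector<ExternalUse> uses = collectExternalUses(tree);
  // Only the in-tree use of B as store0's pointer operand is recorded.
  assert(uses.size() == 1 && uses[0].first == &gepB &&
         uses[0].second == &store0);
  return extractCost(uses, 1);
}
```

In this model every user is in-tree, so without the predicate the list would be empty and the extract would cost nothing. With it, the use of B by the store is recorded and shows up in the estimate.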



On Sep 2, 2014, at 12:03 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> The test case has trailing whitespace.
> 
> Note that the ExternalUse created while vectorizing the tree won’t participate in the cost estimate. I am not sure that fixing this would be worth the added complexity.
> 
> Otherwise, LGTM.
> 
> Thanks!
> 
> 
>> On Aug 29, 2014, at 2:02 PM, Yi Jiang <yjiang at apple.com> wrote:
>> 
>> Hi, 
>> 
>> This patch is for radar 18144665, which was exposed by Michael Z's commit. We hit an assertion when vectorization of GEPs with multiple uses is enabled on this case:
>> 
>> fn2() {
>>   a[11] = a + 11;
>>   a[12] = a + 56;
>> }
>> 
>> The cause of this bug is that our current framework assumes all in-tree scalars will eventually become vectors, so no in-tree use needs an extract. However, this does not hold for vector instructions that have scalar operands: those operands remain scalar, so we still need the extract. For example, in this test case:
>> 
>>   %0 = load i64** @a, align 8, !tbaa !1
>>   %add.ptr1 = getelementptr inbounds i64* %0, i64 56      // A
>>   %add.ptr = getelementptr inbounds i64* %0, i64 11        // B
>>   %1 = insertelement <2 x i64*> undef, i64* %0, i32 0
>>   %2 = insertelement <2 x i64*> %1, i64* %0, i32 1
>>   %3 = getelementptr <2 x i64*> %2, <2 x i64> <i64 11, i64 56>   // C: Vectorized Value
>>   %4 = ptrtoint <2 x i64*> %3 to <2 x i64>
>>   %arrayidx2 = getelementptr inbounds i64* %0, i64 12
>>   %5 = bitcast i64* %add.ptr to <2 x i64>*    //  D
>>   store <2 x i64> %4, <2 x i64>* %5, align 8, !tbaa !5
>>   ret i32 undef
>> 
>> Now our framework vectorizes A and B into C. However, %add.ptr (B) is also used by D, the bitcast that produces the pointer operand of the vector store. Although this is an in-tree use, we still need to extract the value from C.
>> 
>> So here is my proposal in my patch to solve this:
>> 1) When we build the ExternalUses list, we need to check whether each in-tree use needs an extract instead of ignoring all of them. This helps keep the cost model accurate.
>> 2) When we vectorize a load or store, we currently generate a bitcast to convert the pointer operand. We now need to add this bitcast to the ExternalUses list so that the SLP vectorizer extracts the value.
>> 
>> With this patch applied, we can re-enable the vectorization of GEPs with multiple uses. Any comments are appreciated.
>>  
>> 
>> <18144665.patch>
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 18144665-2.patch
Type: application/octet-stream
Size: 8670 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140902/7c797714/attachment.obj>