[PATCH] D20474: when calculating RegUsages, ignore instructions which are uniformed after vectorization

Thu Jun 23 15:07:11 PDT 2016

wmi added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6247-6249
@@ -6238,15 +6246,5 @@
 
-    // Check that the PHI is only used by the induction increment (UpdateV) or
-    // by GEPs. Then check that UpdateV is only used by a compare instruction or
-    // the loop header PHI.
-    // FIXME: Need precise def-use analysis to determine if this instruction
-    // variable will be vectorized.
-    if (std::all_of(PN->user_begin(), PN->user_end(),
-                    [&](const User *U) -> bool {
-                      return U == UpdateV || isa<GetElementPtrInst>(U);
-                    }) &&
-        std::all_of(UpdateV->user_begin(), UpdateV->user_end(),
-                    [&](const User *U) -> bool {
-                      return U == PN || isa<ICmpInst>(U);
-                    })) {
+    if (std::all_of(PN->user_begin(), PN->user_end(), [&](User *U) -> bool {
+          return Legal->isUniformAfterVectorization(cast<Instruction>(U));
+        }))
       VecValuesToIgnore.insert(PN);
----------------
mssimpso wrote:
> Hi Wei,
> 
> I'm joining this review a bit late, so I apologize if I'm not quite up-to-speed yet. But I'm not sure I follow this. Please correct me if I'm missing something!
> 
> If I take the following test case:
> 
> ```
> define void @test(i32* %a, i64 %n) {
> entry:
>   br label %for.body
> 
> for.body:
>   %i = phi i64 [ %i.next, %for.body ], [ 0, %entry ]
>   %0 = trunc i64 %i to i32 
>   %1 = getelementptr inbounds i32, i32* %a, i32 %0
>   store i32 %0, i32* %1, align 4
>   %i.next = add nuw nsw i64 %i, 1
>   %cond = icmp eq i64 %i.next, %n
>   br i1 %cond, label %for.end, label %for.body
> 
> for.end:
>   ret void
> }
> ```
> 
> We generate vectorized induction variables so that for the store, we have:
> 
> ```
> store <4 x i32> %vec.ind1, <4 x i32>* %3, align 4
> ```
> 
> However, with your change, it looks to me like we will add the induction variable to VecValuesToIgnore where previously we wouldn't have. Is this right?
> 
Thanks for pointing out the problem. For your testcase, the PHI %i is used in instruction %0 = ..., which is considered a uniform instruction, so %i will be put into VecValuesToIgnore. However, treating %0 as uniform may be problematic, because it actually has both a scalar and a vector version after vectorization.
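
To make the concern concrete, here is a minimal, self-contained sketch of the shape of the new check applied to this testcase. It uses toy types rather than the real LLVM classes: ToyInst, ToyLegality, shouldIgnore and phiIgnoredInTestCase are hypothetical names for illustration only, and the set of "uniform" instructions is filled in by hand instead of being computed by the legality analysis.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; NOT llvm::Instruction.
struct ToyInst {
  std::string Name;
  std::vector<const ToyInst *> Users;
};

// Hypothetical stand-in for the legality object queried in the patch.
struct ToyLegality {
  std::set<const ToyInst *> Uniforms;
  bool isUniformAfterVectorization(const ToyInst *I) const {
    return Uniforms.count(I) != 0;
  }
};

// Same shape as the new check: ignore the PHI if every user is uniform.
bool shouldIgnore(const ToyInst &PN, const ToyLegality &Legal) {
  return std::all_of(PN.Users.begin(), PN.Users.end(),
                     [&](const ToyInst *U) {
                       return Legal.isUniformAfterVectorization(U);
                     });
}

// Models the users of %i in the testcase: %0 (the trunc) and %i.next.
// Assumption: %i.next is uniform; whether %0 is uniform is the open question.
bool phiIgnoredInTestCase(bool TruncIsUniform) {
  ToyInst Trunc{"%0", {}};
  ToyInst Update{"%i.next", {}};
  ToyInst Phi{"%i", {&Trunc, &Update}};
  assert(Phi.Users.size() == 2 && "PHI should have exactly two users here");

  ToyLegality Legal;
  Legal.Uniforms.insert(&Update);
  if (TruncIsUniform)
    Legal.Uniforms.insert(&Trunc);
  return shouldIgnore(Phi, Legal);
}
```

In this model, phiIgnoredInTestCase(true) reports that %i would be ignored, while phiIgnoredInTestCase(false) keeps it, so the outcome hinges entirely on whether %0 is classified as uniform.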

I will take your testcase into consideration. Thanks.

Repository:
  rL LLVM

http://reviews.llvm.org/D20474