[PATCH] D20474: when calculating RegUsages, ignore instructions which are uniformed after vectorization

Thu Jun 23 14:38:29 PDT 2016

mssimpso added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6247-6249
@@ -6238,15 +6246,5 @@
 
-    // Check that the PHI is only used by the induction increment (UpdateV) or
-    // by GEPs. Then check that UpdateV is only used by a compare instruction or
-    // the loop header PHI.
-    // FIXME: Need precise def-use analysis to determine if this instruction
-    // variable will be vectorized.
-    if (std::all_of(PN->user_begin(), PN->user_end(),
-                    [&](const User *U) -> bool {
-                      return U == UpdateV || isa<GetElementPtrInst>(U);
-                    }) &&
-        std::all_of(UpdateV->user_begin(), UpdateV->user_end(),
-                    [&](const User *U) -> bool {
-                      return U == PN || isa<ICmpInst>(U);
-                    })) {
+    if (std::all_of(PN->user_begin(), PN->user_end(), [&](User *U) -> bool {
+          return Legal->isUniformAfterVectorization(cast<Instruction>(U));
+        }))
       VecValuesToIgnore.insert(PN);
----------------
Hi Wei,

I'm joining this review a bit late, so I apologize if I'm not quite up-to-speed yet. But I'm not sure I follow this. Please correct me if I'm missing something!

If I take the following test case:

```
define void @test(i32* %a, i64 %n) {
entry:
  br label %for.body

for.body:
  %i = phi i64 [ %i.next, %for.body ], [ 0, %entry ]
  %0 = trunc i64 %i to i32 
  %1 = getelementptr inbounds i32, i32* %a, i32 %0
  store i32 %0, i32* %1, align 4
  %i.next = add nuw nsw i64 %i, 1
  %cond = icmp eq i64 %i.next, %n
  br i1 %cond, label %for.end, label %for.body

for.end:
  ret void
}
```

We generate vectorized induction variables so that for the store, we have:

```
store <4 x i32> %vec.ind1, <4 x i32>* %3, align 4
```

However, with your change, it looks to me like we will add the induction variable to VecValuesToIngore where previously we wouldn't have. Is this right?



Repository:
  rL LLVM

http://reviews.llvm.org/D20474