[PATCH] D20474: when calculating RegUsages, ignore instructions which are uniformed after vectorization
Wei Mi via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 23 15:07:11 PDT 2016
wmi added inline comments.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6247-6249
@@ -6238,15 +6246,5 @@
- // Check that the PHI is only used by the induction increment (UpdateV) or
- // by GEPs. Then check that UpdateV is only used by a compare instruction or
- // the loop header PHI.
- // FIXME: Need precise def-use analysis to determine if this instruction
- // variable will be vectorized.
- if (std::all_of(PN->user_begin(), PN->user_end(),
- [&](const User *U) -> bool {
- return U == UpdateV || isa<GetElementPtrInst>(U);
- }) &&
- std::all_of(UpdateV->user_begin(), UpdateV->user_end(),
- [&](const User *U) -> bool {
- return U == PN || isa<ICmpInst>(U);
- })) {
+ if (std::all_of(PN->user_begin(), PN->user_end(), [&](User *U) -> bool {
+ return Legal->isUniformAfterVectorization(cast<Instruction>(U));
+ }))
VecValuesToIgnore.insert(PN);
----------------
mssimpso wrote:
> Hi Wei,
>
> I'm joining this review a bit late, so I apologize if I'm not quite up-to-speed yet. But I'm not sure I follow this. Please correct me if I'm missing something!
>
> If I take the following test case:
>
> ```
> define void @test(i32* %a, i64 %n) {
> entry:
> br label %for.body
>
> for.body:
> %i = phi i64 [ %i.next, %for.body ], [ 0, %entry ]
> %0 = trunc i64 %i to i32
> %1 = getelementptr inbounds i32, i32* %a, i32 %0
> store i32 %0, i32* %1, align 4
> %i.next = add nuw nsw i64 %i, 1
> %cond = icmp eq i64 %i.next, %n
> br i1 %cond, label %for.end, label %for.body
>
> for.end:
> ret void
> }
> ```
>
> We generate vectorized induction variables so that for the store, we have:
>
> ```
> store <4 x i32> %vec.ind1, <4 x i32>* %3, align 4
> ```
>
> However, with your change, it looks to me like we will add the induction variable to VecValuesToIngore where previously we wouldn't have. Is this right?
>
Thanks for pointing out the problem. For your testcase, %1 is only used in instruction %0 = ... which is a uniform instruction so it will be put into VecValuesToIngore. However, it may be problematic for %0 to be uniform because actually it has both scalar and vector version after vectorization.
I will add your testcase into consideration. Thanks.
Repository:
rL LLVM
http://reviews.llvm.org/D20474
More information about the llvm-commits
mailing list