[PATCH] D37702: [LV] Clamp the VF to the trip count

Anna Thomas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 12 07:26:01 PDT 2017


anna added a comment.

@Ayal, any other comments or does this look good to go? Thanks.



================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6179
   // If we optimize the program for size, avoid creating the tail loop.
-  unsigned TC = PSE.getSE()->getSmallConstantTripCount(TheLoop);
   DEBUG(dbgs() << "LV: Found trip count: " << TC << '\n');
 
----------------
Ayal wrote:
> anna wrote:
> > Ayal wrote:
> > > Probably good to hoist this DEBUG line along with computing TC earlier.
> > When `OptForSize` is true (i.e. at this point in the code), we know for sure that the small constant trip count is either the actual constant trip count or 1 if the trip count is unknown.
> > 
> > I left this here rather than moving it up because there may be *more* spurious cases where we do not have a *small* trip count, and the DEBUG statement wouldn't be useful.
> At this point I think TC can also be zero if we don't know it, which is why we check if TC < 2 below (i.e., if (TC == 0 || TC == 1)).
> 
> Printing out the value of TC when computed above can only aid debugging, afaics.
> 
> In any case, I'm also ok leaving it here; you're also printing TC in computeFeasibleMaxVF() where it clamps MaxVF.
Yes, the clamped value is printed in the debug statement.
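
For readers following the thread, here is a minimal sketch of what clamping the VF to the trip count could look like. It is not the code from the patch: `clampMaxVFToTripCount` is a hypothetical helper, a trip count of 0 is taken to mean "unknown" (as discussed above), and rounding down to a power of two is an assumption of this sketch rather than something the review states.

```cpp
// Hypothetical helper for illustration only; this is NOT LLVM's
// computeFeasibleMaxVF(). TripCount == 0 models an unknown trip count.
unsigned clampMaxVFToTripCount(unsigned MaxVF, unsigned TripCount) {
  // Unknown trip count, or a trip count that already covers a full
  // vector iteration: leave MaxVF untouched.
  if (TripCount == 0 || TripCount >= MaxVF)
    return MaxVF;

  // Assumption of this sketch: pick the largest power of two that does
  // not exceed the known trip count, so the vector body never processes
  // more elements than the loop has iterations.
  unsigned Clamped = 1;
  while (Clamped * 2 <= TripCount)
    Clamped *= 2;
  return Clamped;
}
```

With these assumptions, a loop with a known trip count of 4 and a MaxVF of 16 would be clamped to 4, while an unknown trip count (0) leaves MaxVF at 16.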


================
Comment at: test/Transforms/LoopVectorize/X86/vector_max_bandwidth.ll:54
+; CHECK-LABEL: not_too_small_tc
+; CHECK-AVX1: LV: Selecting VF: 16.
+; CHECK-AVX2: LV: Selecting VF: 16.
----------------
Ayal wrote:
> The max possible vector width for this test on AVX1 is 16, so we're missing the point in checking its selected VF here. Suffice to CHECK-AVX2 only, whose max possible vector width is 32, as noted.
Agreed, I'll remove the CHECK-AVX1 check.


https://reviews.llvm.org/D37702




