[PATCH] D37702: [LV] Clamp the VF to the trip count
Anna Thomas via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 12 07:26:01 PDT 2017
anna added a comment.
@Ayal, any other comments or does this look good to go? Thanks.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6179
// If we optimize the program for size, avoid creating the tail loop.
- unsigned TC = PSE.getSE()->getSmallConstantTripCount(TheLoop);
DEBUG(dbgs() << "LV: Found trip count: " << TC << '\n');
----------------
Ayal wrote:
> anna wrote:
> > Ayal wrote:
> > > Probably good to hoist this DEBUG line along with computing TC earlier.
> > When `OptForSize` is true (i.e. at this point in the code), we know for sure that the small-trip-count query returns either a constant trip count or, if the trip count is unknown, a value of 1.
> >
> > I left this here rather than moving it up because there may be *more* spurious cases where we do not have a *small* trip count, and the DEBUG statement wouldn't be useful.
> At this point I think TC can also be zero if we don't know it, which is why we check if TC<2 below (i.e., if (TC==0 || TC == 1)).
>
> Printing out the value of TC when computed above can only aid debugging, afaics.
>
> In any case, I'm also ok leaving it here; you're also printing TC in computeFeasibleMaxVF() where it clamps MaxVF.
Yes, the clamped value is printed in the debug statement.
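
For readers following the thread, here is a minimal standalone sketch of the two checks being discussed: the `TC < 2` test under `OptForSize`, and clamping the maximum VF to a known trip count. It is not the code from D37702. The helper names (`floorPowerOfTwo`, `tripCountUnknownOrOne`, `clampVFToTripCount`) are hypothetical, `TC == 0` is assumed to mean "trip count unknown" as Ayal describes, and rounding the clamp down to a power of two is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API.
// Returns the largest power of two that is <= X. X must be non-zero.
static unsigned floorPowerOfTwo(unsigned X) {
  assert(X != 0 && "expected a non-zero value");
  unsigned P = 1;
  while (P <= X / 2)
    P *= 2;
  return P;
}

// Models the `TC < 2` check from the thread: TC == 0 is assumed to mean
// "unknown", and TC == 1 leaves nothing to vectorize.
static bool tripCountUnknownOrOne(unsigned TC) {
  return TC < 2;
}

// Clamps the maximum vectorization factor to a known constant trip count.
// An unknown trip count (TC == 0), or one that is at least MaxVF, leaves
// MaxVF unchanged. Rounding down to a power of two is an assumption here.
static unsigned clampVFToTripCount(unsigned MaxVF, unsigned TC) {
  assert(MaxVF != 0 && "expected a non-zero maximum VF");
  if (TC == 0 || TC >= MaxVF)
    return MaxVF;
  return floorPowerOfTwo(TC);
}
```

Under these assumptions, a loop with a known trip count of 4 and a maximum VF of 16 would be clamped to a VF of 4, while an unknown trip count leaves the maximum VF untouched.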
================
Comment at: test/Transforms/LoopVectorize/X86/vector_max_bandwidth.ll:54
+; CHECK-LABEL: not_too_small_tc
+; CHECK-AVX1: LV: Selecting VF: 16.
+; CHECK-AVX2: LV: Selecting VF: 16.
----------------
Ayal wrote:
> The max possible vector width for this test on AVX1 is 16, so we're missing the point in checking its selected VF here. Suffice to CHECK-AVX2 only, whose max possible vector width is 32, as noted.
Agreed, I'll remove the CHECK-AVX1 check.
https://reviews.llvm.org/D37702