[PATCH] D90687: [LV] Ignore VF hint when unsafe

Tue Nov 3 12:48:33 PST 2020

c-rhodes added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5238
+  if (!VectorizerParams::isVFForced() && UserVF &&
+      UserVF > computeFeasibleMaxVF(TC)) {
+    reportVectorizationFailure(
----------------
Meinersbur wrote:
> Meinersbur wrote:
> > fhahn wrote:
> > > dmgreen wrote:
> > > > I don't think computeFeasibleMaxVF is the maximum safe width. It does a lot of things and is largely based on the backend vector widths. I'm guessing that's why so many tests are changing.
> > > > 
> > > > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> > > > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> > > 
> > > computeFeasibleMaxVF should only return legal vectorization factors (using getMaxSafeRegisterWidht) I think. Given that we are free to ignore the hint, if it is not useful, why not just use the largest safe vectorization factors instead of bailing out?
> > > 
> > > Unfortunately the handling of UserVF is a bit of a mess, but I think it might be preferable to clamp the UserVF to the maximum vectorization factor in the caller of computeMaxVF where UserVF is actually used.
> > `Legal.getMaxSafeRegisterWidth()` is called within `computeFeasibleMaxVF`. With `computeFeasibleMaxVF` considering additional architecture concerns, can these be just ignored?
> With `computeFeasibleMaxVF` already called later, could you store its result for both uses?
> 
> I suggest to move `UserVF ? UserVF : computeFeasibleMaxVF(TC)` which is duplicated multiple times below before this condition. Such as:
> ```
> auto MaxVF = computeFeasibleMaxVF(TC);
> if (UserVF) {
>   if (UserVF > MaxVF) {
>     ...
>   }
>   MaxVF = UserFC;
> }
> ```
> 
> > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> 
> computeFeasibleMaxVF should only return legal vectorization factors (using getMaxSafeRegisterWidht) I think. Given that we are free to ignore the hint, if it is not useful, why not just use the largest safe vectorization factors instead of bailing out?

Yeah that's right, `computeFeasibleMaxVF` is based on `Legal->getMaxSafeRegisterWidth()` and bounded by the widest register according to the TTI.

> 
> Unfortunately the handling of UserVF is a bit of a mess, but I think it might be preferable to clamp the UserVF to the maximum vectorization factor in the caller of computeMaxVF where UserVF is actually used.

Sure, clamping sounds good to me and @Meinersbur suggested this as well, I'll update the patch.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5238
+  if (!VectorizerParams::isVFForced() && UserVF &&
+      UserVF > computeFeasibleMaxVF(TC)) {
+    reportVectorizationFailure(
----------------
c-rhodes wrote:
> Meinersbur wrote:
> > Meinersbur wrote:
> > > fhahn wrote:
> > > > dmgreen wrote:
> > > > > I don't think computeFeasibleMaxVF is the maximum safe width. It does a lot of things and is largely based on the backend vector widths. I'm guessing that's why so many tests are changing.
> > > > > 
> > > > > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> > > > > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> > > > 
> > > > computeFeasibleMaxVF should only return legal vectorization factors (using getMaxSafeRegisterWidht) I think. Given that we are free to ignore the hint, if it is not useful, why not just use the largest safe vectorization factors instead of bailing out?
> > > > 
> > > > Unfortunately the handling of UserVF is a bit of a mess, but I think it might be preferable to clamp the UserVF to the maximum vectorization factor in the caller of computeMaxVF where UserVF is actually used.
> > > `Legal.getMaxSafeRegisterWidth()` is called within `computeFeasibleMaxVF`. With `computeFeasibleMaxVF` considering additional architecture concerns, can these be just ignored?
> > With `computeFeasibleMaxVF` already called later, could you store its result for both uses?
> > 
> > I suggest to move `UserVF ? UserVF : computeFeasibleMaxVF(TC)` which is duplicated multiple times below before this condition. Such as:
> > ```
> > auto MaxVF = computeFeasibleMaxVF(TC);
> > if (UserVF) {
> >   if (UserVF > MaxVF) {
> >     ...
> >   }
> >   MaxVF = UserFC;
> > }
> > ```
> > 
> > > It should probably be based on Legal.getMaxSafeRegisterWidth()?
> > 
> > computeFeasibleMaxVF should only return legal vectorization factors (using getMaxSafeRegisterWidht) I think. Given that we are free to ignore the hint, if it is not useful, why not just use the largest safe vectorization factors instead of bailing out?
> 
> Yeah that's right, `computeFeasibleMaxVF` is based on `Legal->getMaxSafeRegisterWidth()` and bounded by the widest register according to the TTI.
> 
> > 
> > Unfortunately the handling of UserVF is a bit of a mess, but I think it might be preferable to clamp the UserVF to the maximum vectorization factor in the caller of computeMaxVF where UserVF is actually used.
> 
> Sure, clamping sounds good to me and @Meinersbur suggested this as well, I'll update the patch.
> I don't think computeFeasibleMaxVF is the maximum safe width. It does a lot of things and is largely based on the backend vector widths. I'm guessing that's why so many tests are changing.

I did find for the X86 tests that have changed it was because of a backend vector width of 128-bit, I expect these changes will still be required when clamping the UserVF to the maximum vectorization factor as suggested.

================
Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll:192

-!71 = distinct !{!71, !72, !73}
-!72 = !{!"llvm.loop.vectorize.width", i32 4}
----------------
fhahn wrote:
> Why those test changes?
> Why those test changes?

The maximum VF=2, it's computed as `WidestRegister / WidestType` where the the widest register is 128-bit.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90687/new/

https://reviews.llvm.org/D90687