[PATCH] D120215: [LV] Invalidate widening decisions after maximizing vector bandwidth

Tue Mar 22 01:57:34 PDT 2022

dmgreen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5553
   ElementCount MaxVF = MaxVectorElementCount;
-  if (TTI.shouldMaximizeVectorBandwidth() ||
-      (MaximizeBandwidth && isScalarEpilogueAllowed())) {
+  if ((MaximizeBandwidth.getNumOccurrences() > 0 && MaximizeBandwidth) ||
+      TTI.shouldMaximizeVectorBandwidth()) {
----------------
fhahn wrote:
> > This slightly changes the way that the MaximizeVectorBandwidth option works to make it easier to test, always honouring the option if it is set.
> 
> IIUC this effectively ignores the default value of the option now, right? How does that make it easier to test?
I was mostly referring to it wanting to not check isScalarEpilogueAllowed any more. I think what we really want is for the option to always take precedence. I'll try and change it to that.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/maximize-bandwidth-invalidate.ll:13
+
+; COST: LV: Found an estimated cost of 3000000 for VF 2 For instruction:   %0 = load
+; COST: LV: Found an estimated cost of 3000000 for VF 4 For instruction:   %0 = load
----------------
fhahn wrote:
> Could you pre-commit the test and and update the diff to only show the difference?
Unfortunately it requires the command line option in order to test. The costs of the last 2 are 1 without the change, as it treats them as continuous loads, not masked loads. And it choses a vector factor of 8.

I could commit the option handling change first if you prefer. It would leave just the invalidateCostModelingDecisions call and the changes to these scores/codegen differences.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/maximize-bandwidth-invalidate.ll:94
+  %incdec.ptr = getelementptr inbounds i8, i8* %pInVec.addr.042, i64 1
+  %0 = load i8, i8* %pInVec.addr.042, align 1
+  %conv = sext i8 %0 to i32
----------------
fhahn wrote:
> The test is only checking the cost of `%0`, could the loop body be reduced to fewer loads/reductions?
It needs the whole thing to pick the wrong vector factor, which this is also testing. Without that it always chooses VF=1, even with the wrong scores.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120215/new/

https://reviews.llvm.org/D120215