[PATCH] D53668: [LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 30 01:58:37 PDT 2018


Ayal accepted this revision.
Ayal added a comment.
This revision is now accepted and ready to land.

> Also added a test with stride 3.

Thanks, one was indeed needed.
LGTM, with few minor additional optional suggestions.



================
Comment at: include/llvm/Analysis/VectorUtils.h:425
+  /// is no other means (such as masking) to support them. This can happen when
+  /// we optimize for size and don't allow creating a scalar epilogue.
   void invalidateGroupsRequiringScalarEpilogue();
----------------
The "if there is no other means..." is now part of the "This can happen when". I.e.,
"This can happen when optimizing for size forbids a scalar epilogue, and the gap cannot be filtered by masking the load/store."


================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:850
     // We only scale the cost of loads since interleaved store groups aren't
     // allowed to have gaps.
     if (Opcode == Instruction::Load && VecTySize > VecTyLTSize) {
----------------
This scaling works just as well for gap-masked loads, right?
(Another place to remember extending when introducing support for gap-masked stores, btw)


================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:921
 
-    if (!IsMasked)
+    if (!UseMaskForCond)
       return Cost;
----------------
Comment here that UseMaskForGaps alone does not add to Cost, because its mask is uniform. Unlike below where it adds the cost of And-ing the two masks.


================
Comment at: lib/Analysis/VectorUtils.cpp:959
         << "LV: Invalidate candidate interleaved group due to gaps that "
-           "require a scalar epilogue.\n");
+           "require a scalar epilogue (not allowed under optsize) or masking "
+           "of interleave-groups (not enabled). \n");
----------------
"or masking ..." >> "and cannot be masked (not enabled)."


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1243
+  /// or as a peel-loop to handle gaps in interleave-groups.
+  /// Under optsize we don't allow any iterations to execute in the scalar
+  /// loop. 
----------------
"Under optsize" and when the trip count is very small "we don't ..."


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2077
       Instruction *NewLoad;
-      if (IsMaskRequired) {
-        auto *Undefs = UndefValue::get(Mask[Part]->getType());
-        auto *RepMask = createReplicatedMask(Builder, InterleaveFactor, VF);
-        Value *ShuffledMask = Builder.CreateShuffleVector(
-            Mask[Part], Undefs, RepMask, "interleaved.mask");
-        NewLoad = Builder.CreateMaskedLoad(NewPtrs[Part], Group->getAlignment(), 
-                                           ShuffledMask, UndefVec,
-                                           "wide.masked.vec");
+      if (IsMaskRequired || MaskForGaps) {
+        assert(useMaskedInterleavedAccesses(*TTI) &&
----------------
Rename `IsMaskRequired` to `[Is]MaskForCondRequired`?


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2084
+          auto *RepMask = createReplicatedMask(Builder, InterleaveFactor, VF);
+          Value *ShuffledMask = Builder.CreateShuffleVector(
+              Mask[Part], Undefs, RepMask, "interleaved.mask");
----------------
Rename `ShuffledMask` to `MaskForCond`?


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4393
+  // enabled it, because otherwise it either wouldn't have been created or
+  // it should have been invalidated by the costModel.
+  assert(useMaskedInterleavedAccesses(TTI) &&
----------------
"costModel" >> "cost model", or "CostModel"


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4649
 
+  IsScalarEpilogueAllowed = !OptForSize;
+
----------------
It's indeed good to record in CostModel the constraint forbidding a scalar epilogue, instead of passing an overloaded and abused OptForSize parameter around. It should be recorded at the outset, say at the constructor of CostModel, to be consistent; and isScalarEpilogueAllowed() should be used inside the CostModel instead of OptForSize throughout. This NFC refactoring can be done separately, before/after this patch.


================
Comment at: test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll:89
+; When enable-masked-interleaved-access is disabled the interleave-group is
+; invalidated, so we end up scalarizing.
 ; (Before the fix that this test checks, we used to create an epilogue despite
----------------
Add checks for this scalarizing when enable-masked-interleaved-access is disabled, or comment that it is already checked above?


https://reviews.llvm.org/D53668





More information about the llvm-commits mailing list