[PATCH] D114697: [X86][Costmodel] `getInterleavedMemoryOpCostAVX512()`: masked load can not be folded into a shuffle

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 29 04:15:17 PST 2021


lebedev.ri created this revision.
lebedev.ri added reviewers: RKSimon, pengfei.
lebedev.ri added a project: LLVM.
Herald added a subscriber: hiraditya.
lebedev.ri requested review of this revision.

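The mask on a masked shuffle applies to the output elements, not to the
input operand, so a masked load can not be folded into the shuffle and
must be costed as a separate memory operation.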
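
As a rough illustration of the costing rule this patch changes, here is a
minimal standalone sketch. The helper name `numUnfoldedLoads` and its
free-function form are hypothetical and exist only for this example; in the
patch the logic is inline in `getInterleavedMemoryOpCostAVX512()`. Only the
variable names are taken from the diff.

```cpp
// Hypothetical standalone helper, not part of LLVM: it mirrors the
// NumOfUnfoldedLoads computation from the diff below.
unsigned numUnfoldedLoads(unsigned NumOfMemOps, unsigned NumOfResults,
                          bool UseMaskForCond, bool UseMaskForGaps) {
  // Either kind of mask means the loads are emitted as masked loads.
  bool UseMaskedMemOp = UseMaskForCond || UseMaskForGaps;

  // Masked loads, or more than one result: no load is folded into a shuffle.
  if (UseMaskedMemOp || NumOfResults > 1)
    return NumOfMemOps;

  // Unmasked loads with a single result: about half of them may be folded.
  return NumOfMemOps / 2;
}
```

With 4 memory ops and a single result, this returns 2 for unmasked loads and
4 once either mask flag is set; before the patch the masked case would also
have returned 2.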


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D114697

Files:
  llvm/lib/Target/X86/X86TargetTransformInfo.cpp


Index: llvm/lib/Target/X86/X86TargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -5275,7 +5275,8 @@
   auto *SingleMemOpTy = FixedVectorType::get(VecTy->getElementType(),
                                              LegalVT.getVectorNumElements());
   InstructionCost MemOpCost;
-  if (UseMaskForCond || UseMaskForGaps)
+  bool UseMaskedMemOp = UseMaskForCond || UseMaskForGaps;
+  if (UseMaskedMemOp)
     MemOpCost = getMaskedMemoryOpCost(Opcode, SingleMemOpTy, Alignment,
                                       AddressSpace, CostKind);
   else
@@ -5286,7 +5287,7 @@
   MVT VT = MVT::getVectorVT(MVT::getVT(VecTy->getScalarType()), VF);
 
   InstructionCost MaskCost;
-  if (UseMaskForCond || UseMaskForGaps) {
+  if (UseMaskedMemOp) {
     APInt DemandedLoadStoreElts = APInt::getZero(VecTy->getNumElements());
     for (unsigned Index : Indices) {
       assert(Index < Factor && "Invalid index for interleaved memory op");
@@ -5349,9 +5350,10 @@
         NumOfLoadsInInterleaveGrp;
 
     // About a half of the loads may be folded in shuffles when we have only
-    // one result. If we have more than one result, we do not fold loads at all.
+    // one result. If we have more than one result, or the loads are masked,
+    // we do not fold loads at all.
     unsigned NumOfUnfoldedLoads =
-        NumOfResults > 1 ? NumOfMemOps : NumOfMemOps / 2;
+        UseMaskedMemOp || NumOfResults > 1 ? NumOfMemOps : NumOfMemOps / 2;
 
     // Get a number of shuffle operations per result.
     unsigned NumOfShufflesPerResult =

