[PATCH] D15677: Cost for Gather and Scatter operations.
Ayal Zaks via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 22 12:18:53 PST 2015
Ayal added inline comments.
================
Comment at: ../include/llvm/Analysis/TargetTransformInfo.h:464
@@ +463,3 @@
+ /// \p DataTy - a vector type of the data to be loaded or stored
+ /// \Ptr - poiner [or vector of pointers] - address[es] in memory
+ /// \p Masked - true when the memory access is predicated
----------------
poin[t]er
================
Comment at: ../include/llvm/Analysis/TargetTransformInfo.h:833
@@ +832,3 @@
+ ArrayRef<Value *> Args) override {
+ return Impl.getIntrinsicInstrCost(ID, Args, RetTy);
+ }
----------------
keep order of arguments "ID, RetTy, Args" similar to "ID, RetTy, Tys"?
================
Comment at: ../include/llvm/CodeGen/BasicTTIImpl.h:605
@@ +604,3 @@
+ Value *Mask = Args[2];
+ bool MaskedOp = !isa<Constant>(Mask);
+ unsigned Alignment = cast<ConstantInt>(Args[1])->getZExtValue();
----------------
Maybe better call it ConstMasked or VarMasked instead of MaskedOp (throughout)
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1294
@@ +1293,3 @@
+
+ auto getIndexSizeInBits = [](Value *Ptr, const DataLayout& DL) {
+ unsigned IndexSize = DL.getPointerSizeInBits();
----------------
Document the patterns that fold into 32bit indexed GS.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1296
@@ +1295,3 @@
+ unsigned IndexSize = DL.getPointerSizeInBits();
+ if (IndexSize == 64 && isa<GetElementPtrInst>(Ptr)) {
+ unsigned NumOfVarIndices = 0;
----------------
May be better to first check if IndexSize < 64 or dyn_cast is null and return 32, to reduce subsequent indentation.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1300
@@ +1299,3 @@
+ Value *Ptrs = GEP->getPointerOperand();
+ if (isa<VectorType>(Ptrs->getType()) && !getSplatValue(Ptrs))
+ return IndexSize;
----------------
check isVectorTy() instead?
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1316
@@ +1315,3 @@
+ }
+ if (NumOfVarIndices > 1)
+ return IndexSize; // 64
----------------
check this when bumping NumOfVarIndices. E.g., if (++NumOfVarIndices > 1)
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1329-1330
@@ +1328,4 @@
+ // Handle splitting of vector of pointers
+ Type *SplitSrcTy = VectorType::get(SrcVTy->getScalarType(), VF / IdxsLT.first);
+ return SplitFactor * getGSVectorCost(Opcode, SplitSrcTy, Ptr);
+ }
----------------
Divide VF by SplitFactor instead of IdxsLT.first?
Else assert IdxsLT.first > 1? Otherwise recursion may spin.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1333-1335
@@ +1332,5 @@
+
+ const int GatherOverhead = 6;
+ const int ScatterOverhead = 10;
+ int Cost = VF + ((Opcode == Instruction::Load) ? GatherOverhead : ScatterOverhead);
+
----------------
Some explanation for the formula and overheads?
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1337
@@ +1336,3 @@
+
+ Type * MaskTy = VectorType::get(Type::getInt1Ty(getGlobalContext()), VF);
+ std::pair<int, MVT> MaskLT = TLI->getTypeLegalizationCost(DL, MaskTy);
----------------
space after *
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1339
@@ +1338,3 @@
+ std::pair<int, MVT> MaskLT = TLI->getTypeLegalizationCost(DL, MaskTy);
+ std::pair<int, MVT> DataLT = TLI->getTypeLegalizationCost(DL, SrcVTy);
+ bool LegalizeMask =
----------------
DataLT = SrcLT?
assert MaskLT.first == 1?
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1392
@@ +1391,3 @@
+ for (unsigned i = 0; i < VF; ++i)
+ // Insertion the results of all slalar loads to one vector
+ InsertExtractCost +=
----------------
Typos. Suggest: // Add the cost of inserting each scalar load into the vector.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1397
@@ +1396,3 @@
+ for (unsigned i = 0; i < VF; ++i)
+ // Extract elements from data vector for scalar stores
+ InsertExtractCost +=
----------------
Suggest: // Add the cost of extracting each element out to a scalar store.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1412
@@ +1411,3 @@
+ "Masked Gather / Scatter should be legal on this target");
+ Type *PtrTy = isa<VectorType>(Ptr->getType()) ? cast<VectorType>(Ptr->getType())->getElementType() :
+ Ptr->getType();
----------------
Is 'cast' needed? See pattern used below in getGatherScatterOpCost, or use dyn_cast instead?
Repository:
rL LLVM
http://reviews.llvm.org/D15677
More information about the llvm-commits
mailing list