[PATCH] D144375: [SystemZ, SLP] Enable FP horizontal reductions and fix SLP cost computation.

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 23 10:55:38 PST 2023


ABataev added inline comments.


================
Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:1250-1254
+    if (thisT()->supportsEfficientVectorElementLoadStore())
+      for (unsigned Idx = 0; Idx < VL.size(); ++Idx)
+        if (DemandedElts[Idx] && isa<LoadInst>(VL[Idx]))
+          Cost -= thisT()->getVectorInstrCost(Instruction::InsertElement, Ty,
+                                              CostKind, Idx, nullptr, nullptr);
----------------
Better to clear DemandedElts[Idx] for such loads before calling getScalarizationOverhead, rather than subtracting the cost afterwards.
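
To illustrate the difference, here is a minimal standalone sketch. It is not LLVM code: `toyScalarizationOverhead`, `gatherCostSubtract`, `gatherCostMasked`, and `kInsertCost` are hypothetical names, and the cost model (one unit per demanded lane) is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed flat cost of inserting one element; a real target would query TTI.
constexpr int kInsertCost = 1;

// Toy stand-in for a scalarization-overhead query: one insert per demanded lane.
int toyScalarizationOverhead(const std::vector<bool> &Demanded) {
  int Cost = 0;
  for (bool D : Demanded)
    if (D)
      Cost += kInsertCost;
  return Cost;
}

// Shape of the patch: cost every demanded lane, then subtract the load lanes.
int gatherCostSubtract(const std::vector<bool> &Demanded,
                       const std::vector<bool> &IsLoad) {
  assert(Demanded.size() == IsLoad.size() && "masks must have the same length");
  int Cost = toyScalarizationOverhead(Demanded);
  for (std::size_t Idx = 0; Idx < Demanded.size(); ++Idx)
    if (Demanded[Idx] && IsLoad[Idx])
      Cost -= kInsertCost;
  return Cost;
}

// Shape of the suggestion: clear the load lanes in the mask, then query once.
int gatherCostMasked(std::vector<bool> Demanded,
                     const std::vector<bool> &IsLoad) {
  assert(Demanded.size() == IsLoad.size() && "masks must have the same length");
  for (std::size_t Idx = 0; Idx < Demanded.size(); ++Idx)
    if (IsLoad[Idx])
      Demanded[Idx] = false;
  return toyScalarizationOverhead(Demanded);
}
```

With this flat cost model both variants return the same number. The masked form simply avoids depending on the overhead being an exact per-lane sum, which a target-specific implementation may not guarantee.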


================
Comment at: llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp:1030-1051
+  if (Ty->isIntOrIntVectorTy(64)) {
+    // VLVGP will insert two GPRs with one instruction, while VLE will load
+    // an element directly with no extra cost. Take special care for cases
+    // where one element is loaded with VLE and the other one still needs an
+    // insertion.
+    assert(VL.size() == Ty->getNumElements() && "Ty does not match the values.");
+    TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
----------------
I don't see the call to getScalarizationOverhead. Also, can you try to implement an overloaded version of getScalarizationOverhead that knows how to handle this, instead of moving getGatherCost to TTI?
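
A rough standalone sketch of what such an overload could look like, again not LLVM code. `ScalarKind` and both `toyOverhead` overloads are hypothetical, and the costs are assumptions taken loosely from the comment in the hunk above: a lane fed by a load is free (VLE-like), and a pair of 64-bit lanes that still needs any insertion costs one instruction (VLVGP-like).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical classification of each scalar that would feed a vector lane.
enum class ScalarKind { Load, Other };

// Baseline overload: no knowledge of the scalars, one insert per demanded lane.
int toyOverhead(const std::vector<bool> &Demanded) {
  int Cost = 0;
  for (bool D : Demanded)
    if (D)
      ++Cost;
  return Cost;
}

// Value-aware overload: walks the lanes in pairs. A pair costs one unit if any
// demanded lane in it is not a load, and nothing if every demanded lane is a load.
int toyOverhead(const std::vector<bool> &Demanded,
                const std::vector<ScalarKind> &VL) {
  assert(Demanded.size() == VL.size() && "mask does not match the values");
  int Cost = 0;
  for (std::size_t I = 0; I < VL.size(); I += 2) {
    bool NeedsInsert = false;
    for (std::size_t J = I; J < I + 2 && J < VL.size(); ++J)
      if (Demanded[J] && VL[J] != ScalarKind::Load)
        NeedsInsert = true;
    if (NeedsInsert)
      ++Cost;
  }
  return Cost;
}
```

Keeping the value-aware logic behind an overload of the same query would let callers such as SLP pass the scalars when they have them, without a separate gather-cost hook in TTI.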


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144375/new/

https://reviews.llvm.org/D144375


