[PATCH] D29540: Scalarization overhead estimation in getIntrinsicInstrCost() improved

Wed Mar 8 05:57:33 PST 2017

hfinkel added inline comments.

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:632
   /// \returns The cost of Intrinsic instructions. Types analysis only.
+  /// If ScalarizationCostPassed is UINT_MAX it will be computed based on types.
   int getIntrinsicInstrCost(Intrinsic::ID ID, Type *RetTy,
----------------
Please define what "it" is. This is the cost of scalarizing the arguments and the return value, right?

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:638
   /// \returns The cost of Intrinsic instructions. Analyses the real arguments.
+  /// Three cases of input are handled: 1. scalar instruction 2. vector
+  /// instruction 3. scalar instruction which is to be vectorized with VF.
----------------
  /// Three cases are handled:

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:640
+  /// instruction 3. scalar instruction which is to be vectorized with VF.
+  /// If VF > 1, it will be applied before analyzis.
   int getIntrinsicInstrCost(Intrinsic::ID ID, Type *RetTy,
----------------
"If VF > 1, it will be applied before analyzis." - I don't see how this adds value to the user of the interface? I don't see how else it could work or what it would mean if it worked otherwise.

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:725
+    unsigned RetVF = (RetTy->isVectorTy() ? RetTy->getVectorNumElements() : 1);
+    assert ((RetVF == 1 || VF == 1) && "Can't multiply VF here.");
+
----------------
"Can't multiply VF here." does not really explain the problem. How about, "VF > 1 and RetVF is a vector type"?

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:741
+      unsigned ScalarizationCost = UINT_MAX;
+      if (RetVF > 1 || VF > 1) {
+        ScalarizationCost = getScalarizationOverhead(RetTy, true, false);
----------------
I don't understand why `RetVF > 1` is here. If RetVF is not one, then VF needs to be one, and so we're not vectorizing. As a result, we're not scalarizing either, and so adding the scalarization cost associated with the return type seems odd.

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:782
     default: {
       // Assume that we need to scalarize this intrinsic.
+      unsigned ScalarizationCost = ScalarizationCostPassed;
----------------
This interface seems a bit broken for how we're using it. Why are we assuming that we need to scalarize all intrinsics? (many target-specific intrinsics are really vector intrinsics - maybe only assume we need to scalarize the generic ones for which we don't have special handling?).

https://reviews.llvm.org/D29540