[PATCH] D29540: Scalarization overhead estimation in getIntrinsicInstrCost() improved

Thu Mar 9 00:20:04 PST 2017

jonpa added inline comments.

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:632
   /// \returns The cost of Intrinsic instructions. Types analysis only.
+  /// If ScalarizationCostPassed is UINT_MAX it will be computed based on types.
   int getIntrinsicInstrCost(Intrinsic::ID ID, Type *RetTy,
----------------
hfinkel wrote:
> Please define what "it" is. This is the cost of scalarizing the arguments and the return value, right?
yes

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:640
+  /// instruction 3. scalar instruction which is to be vectorized with VF.
+  /// If VF > 1, it will be applied before analyzis.
   int getIntrinsicInstrCost(Intrinsic::ID ID, Type *RetTy,
----------------
hfinkel wrote:
> "If VF > 1, it will be applied before analyzis." - I don't see how this adds value to the user of the interface? I don't see how else it could work or what it would mean if it worked otherwise.
ok - removed that sentence :-)

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:725
+    unsigned RetVF = (RetTy->isVectorTy() ? RetTy->getVectorNumElements() : 1);
+    assert ((RetVF == 1 || VF == 1) && "Can't multiply VF here.");
+
----------------
hfinkel wrote:
> "Can't multiply VF here." does not really explain the problem. How about, "VF > 1 and RetVF is a vector type"?
sure

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:741
+      unsigned ScalarizationCost = UINT_MAX;
+      if (RetVF > 1 || VF > 1) {
+        ScalarizationCost = getScalarizationOverhead(RetTy, true, false);
----------------
hfinkel wrote:
> I don't understand why `RetVF > 1` is here. If RetVF is not one, then VF needs to be one, and so we're not vectorizing. As a result, we're not scalarizing either, and so adding the scalarization cost associated with the return type seems odd.
My idea was that this function can be called from different contexts ("three cases"). 

The LoopVectorizer can pass the scalar instruction and VF > 1.
CostModel can pass a vector instruction, RetVF > 1 (and VF argument defaults to 1).

Both cases should produce the same return value.

Am I missing something?

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:782
     default: {
       // Assume that we need to scalarize this intrinsic.
+      unsigned ScalarizationCost = ScalarizationCostPassed;
----------------
hfinkel wrote:
> This interface seems a bit broken for how we're using it. Why are we assuming that we need to scalarize all intrinsics? (many target-specific intrinsics are really vector intrinsics - maybe only assume we need to scalarize the generic ones for which we don't have special handling?).
> 
I was assuming this made sense in the 'default:' case at least.

How scalarization cost is computed, and when it is done are two separate issues, so perhaps that can be handled later with a separate patch?

https://reviews.llvm.org/D29540