[PATCH] TTI: Add getCallInstrCost.

Tue Mar 10 12:27:41 PDT 2015

================
Comment at: include/llvm/CodeGen/BasicTTIImpl.h:681
@@ +680,3 @@
+                                    RetTy->getVectorNumElements()))
+      return ScalarCost;
+
----------------
hfinkel wrote:
> mzolotukhin wrote:
> > mzolotukhin wrote:
> > > hfinkel wrote:
> > > > mzolotukhin wrote:
> > > > > hfinkel wrote:
> > > > > > That's not quite right. You still need to call getCallInstrCost with the associated vector types.
> > > > > Hi Hal,
> > > > > 
> > > > > I don't think I follow. When the types are vectors, we first call getCallInstrCost for the corresponding scalar types and then return `ScalarCalls*ScalarCost+ScalarizationCost` for non-vectorizable functions and `ScalarCost` for vectorizable ones. If the types are scalar, we just return 10. What's missing here?
> > > > > 
> > > > > Could you please elaborate on your point a little?
> > > > The key problem here is that this logic does not belong inside the cost-model interface. What the cost model is supposed to provide is an interface that the vectorizer (and other clients) can use to ask the target, "Given this instruction on values of these types, please provide an estimate of the relative reciprocal throughput." The logic of how to transition from scalar instructions to vector instructions must live in the vectorizer, not inside the cost model. By using the function-vectorization logic inside the cost model, you're assuming too much about the actions that the user of the interface will take: the estimate is based not only on the provided instruction and types, but also on how the vectorizer will likely replace the queried function call with some other one.
> > > > 
> > > > We might not want to create external function declarations for each function we might wish to query as part of the interface, so we'll either need to allow for F to be nullptr, or change the interface.
> > > > 
> > > > What we're querying here is the cost of the function call, as provided. Any logic related to vectorization needs to live inside the vectorizer: only the vectorizer knows how it will vectorize and how to formulate any scalarization cost. The only scalarization costs that the TTI interface should know about are those that will be incurred by the type legalization process in CodeGen. Those introduced at the IR level by TTI's users need to be estimated by TTI's users, based on how that scalarization is actually done.
> > > > 
> > > Ok, I see what you mean now, thank you for the explanation!
> > > 
> > > Then my plan is to leave only a trivial getCallCost in TTI, which will always return 10 (in the default implementation). The current logic, which takes scalarization costs into account, will be moved to the vectorizer almost unchanged. With these changes, back-ends could control the vectorizer by setting the costs of Call, InsertElement and ExtractElement instructions. Does that sound good?
> > Actually, we use getScalarizationCost in other places in BasicTTIImpl too, e.g. in getArithmeticInstrCost, getCastInstrCost, and others. Do you think we need to change these places as well?
> Yes, that sounds like the right way to do it; thanks!
> 
> The other places in BasicTTIImpl use getScalarizationCost to model what happens during type/operation legalization in CodeGen, so that's different (and should stay).
> 
Hi Hal,

I looked at what we would need to change to do it this way:
1) We need to implement getScalarizationCost outside TTI, so that it can be used in LoopVectorizer, SLPVectorizer, IndVarSimplify, CodeGenPrepare and other passes that currently might call getArithmeticInstrCost/getCastInstrCost/etc. with a vector type.
2) We need to replace all these uses with
```
if (Ty->isVectorTy())
  cost = getScalarizationCost(Ty) + getInstrCost(Ty)
else
  cost = getInstrCost(Ty)
```
3) We need to make sure that we don't actually change costs on any platform.
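
To make the split discussed above concrete, here is a minimal, self-contained toy model of it. This is only a sketch under stated assumptions and not LLVM code: `ToyType`, `ToyTTI`, `getScalarizationOverhead` and `getVectorCallCost` are hypothetical names, and the insert/extract costs of 1 are assumed values. The cost interface only prices the operation it is asked about (a call costs 10 by default, as in the thread), while the client combines those prices using the `ScalarCalls*ScalarCost+ScalarizationCost` formula.

```cpp
#include <cassert>

// Toy stand-in for an IR type; NumElts == 1 means a scalar type.
struct ToyType {
  unsigned NumElts;
  bool isVector() const { return NumElts > 1; }
};

// Hypothetical, minimal cost interface: it prices only what it is asked about
// and knows nothing about how a client might scalarize or vectorize.
struct ToyTTI {
  unsigned getCallInstrCost(ToyType) const { return 10; } // default from the thread
  unsigned getInsertElementCost() const { return 1; }     // assumed value
  unsigned getExtractElementCost() const { return 1; }    // assumed value
};

// Client-side estimate of IR-level scalarization: one extract per argument
// per lane, plus one insert per lane to rebuild the vector result.
unsigned getScalarizationOverhead(const ToyTTI &TTI, ToyType RetTy,
                                  unsigned NumArgs) {
  if (!RetTy.isVector())
    return 0;
  return RetTy.NumElts *
         (NumArgs * TTI.getExtractElementCost() + TTI.getInsertElementCost());
}

// Client-side cost of a call on RetTy. If a vector variant of the callee
// exists, the call is priced as given; otherwise it is scalarized.
unsigned getVectorCallCost(const ToyTTI &TTI, ToyType RetTy, unsigned NumArgs,
                           bool HasVectorVariant) {
  if (!RetTy.isVector() || HasVectorVariant)
    return TTI.getCallInstrCost(RetTy);
  ToyType ScalarTy{1};
  unsigned ScalarCost = TTI.getCallInstrCost(ScalarTy);
  // ScalarCalls * ScalarCost + ScalarizationCost
  return RetTy.NumElts * ScalarCost +
         getScalarizationOverhead(TTI, RetTy, NumArgs);
}
```

In this toy model a back-end influences the client's decision only through the three per-instruction costs, and every client that wants the scalarized estimate has to call (or duplicate) the client-side helpers.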

All of this seems to me like a pretty massive and probably not very elegant change: we need to change a lot of places, and we'll introduce a lot of very similar code. Also, while the vectorizer might want to determine a specific way to scalarize a value, other users of getArithmeticInstrCost/etc. might not care about it that much. E.g. I don't think it's worth adding this logic to CodeGenPrepare - it'll be the same as what we have now in TTI anyway (and we'll need to keep it both inside and outside TTI).

Did I understand your suggestion correctly?

http://reviews.llvm.org/D8094
