[PATCH] D79162: [Analysis] TTI: Add CastContextHint for getCastInstrCost

Wed Jul 15 10:54:12 PDT 2020

fhahn added inline comments.

================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1011
+    Interleave,    ///< The cast is used with an interleaved load/store.
+    Reversed,      ///< The cast is used with a reversed load/store.
+  };
----------------
dmgreen wrote:
> fhahn wrote:
> > Did you consider also excluding non load/store cast uses? For example, it might be worth to include arithmetic instructions into which casts can be folded, e.g. USUBL & co on AArch64.
> Yes.. That might be a possibility.
> 
> AArch64TTIImpl::getCastInstrCost already has some code to check for isWideningInstruction, which should already catch a lot of these situations I think. That relies on the context instruction being correct, but I believe that will be correct most of the time currently. It's really the type of the load or store that is most often incorrect.
> 
> I'm honestly not sure how well this would scale to many different operand kinds. I think in the long run we need something that makes costing multiple nodes together easier, if you imagine something like a trunc(shift(mul(sext(x), sext(y), 16),  all being a single instruction! And with vplan making larger changes during vectorization it would ideally handle "hypothetical" instructions better without relying on "real" context instructions.
> 
> But this patch is at least a little step that gets the load/store kinds more correct.
> AArch64TTIImpl::getCastInstrCost already has some code to check for isWideningInstruction, which should already catch a lot of these situations I think. That relies on the context instruction being correct, but I believe that will be correct most of the time currently. It's really the type of the load or store that is most often incorrect.

Yep, but I think it only works if the original IR does the widening already as part of the context instruction.

> I'm honestly not sure how well this would scale to many different operand kinds. I think in the long run we need something that makes costing multiple nodes together easier, if you imagine something like a trunc(shift(mul(sext(x), sext(y), 16), all being a single instructio

yeah, the isolated helpers reach a limit of usefulness. A way to cost arbitrary instruction trees would be very useful in many contexts (with some limits to the size I guess).

> But this patch is at least a little step that gets the load/store kinds more correct.

Sounds good to me, it might be good to just mention the reason for only limiting to memory cases initially somewhere (sorry if I missed that this is already said somewhere.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79162/new/

https://reviews.llvm.org/D79162