[PATCH] D79162: [Analysis] TTI: Add CastContextHint for getCastInstrCost
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 15 10:54:12 PDT 2020
fhahn added inline comments.
================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1011
+ Interleave, ///< The cast is used with an interleaved load/store.
+ Reversed, ///< The cast is used with a reversed load/store.
+ };
----------------
dmgreen wrote:
> fhahn wrote:
> > Did you consider also excluding non load/store cast uses? For example, it might be worth to include arithmetic instructions into which casts can be folded, e.g. USUBL & co on AArch64.
> Yes.. That might be a possibility.
>
> AArch64TTIImpl::getCastInstrCost already has some code to check for isWideningInstruction, which should already catch a lot of these situations I think. That relies on the context instruction being correct, but I believe that will be correct most of the time currently. It's really the type of the load or store that is most often incorrect.
>
> I'm honestly not sure how well this would scale to many different operand kinds. I think in the long run we need something that makes costing multiple nodes together easier, if you imagine something like a trunc(shift(mul(sext(x), sext(y), 16), all being a single instruction! And with vplan making larger changes during vectorization it would ideally handle "hypothetical" instructions better without relying on "real" context instructions.
>
> But this patch is at least a little step that gets the load/store kinds more correct.
> AArch64TTIImpl::getCastInstrCost already has some code to check for isWideningInstruction, which should already catch a lot of these situations I think. That relies on the context instruction being correct, but I believe that will be correct most of the time currently. It's really the type of the load or store that is most often incorrect.
Yep, but I think it only works if the original IR does the widening already as part of the context instruction.
> I'm honestly not sure how well this would scale to many different operand kinds. I think in the long run we need something that makes costing multiple nodes together easier, if you imagine something like a trunc(shift(mul(sext(x), sext(y), 16), all being a single instructio
yeah, the isolated helpers reach a limit of usefulness. A way to cost arbitrary instruction trees would be very useful in many contexts (with some limits to the size I guess).
> But this patch is at least a little step that gets the load/store kinds more correct.
Sounds good to me, it might be good to just mention the reason for only limiting to memory cases initially somewhere (sorry if I missed that this is already said somewhere.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79162/new/
https://reviews.llvm.org/D79162
More information about the llvm-commits
mailing list