[PATCH] D79162: [Analysis] TTI: Add CastContextHint for getCastInstrCost

Fri May 15 04:30:02 PDT 2020

dmgreen added a comment.

In D79162#2035811 <https://reviews.llvm.org/D79162#2035811>, @Pierre-vh wrote:

> - Removing instruction from calls to `getCastInstrCost` in the LoopVectorizer.

Although I would agree that in theory this would make a lot of sense, there are other places that currently use the context instruction for things that are not modelled here. Just removing it from the vectorizer will likely lead to regressions in practice. Those regressions won't be caught by tests, because the cost model is usually tested through non-vectorizer tests (which test the cost model directly), and this change would make the vectorizer behave differently from those tests.

From a quick look:

- aarch64 looks for "isWideningInstruction".
- arm/neon can now do the same as of a recent patch.
- systemz seems to use it for loads and... something to do with compares?

I would recommend that for the moment we keep passing the context instruction from the vectorizer. The alternative would be to try to replace all the needed modelling with hints or extra parameters, but that sounds like it will get very messy quite quickly. If only the opcode of the surrounding instructions is used, it will likely be "correct enough" for most cases (I think). In the long run, as the vectorizer learns to transform code more and VPlan starts to learn new tricks, this is more likely to break down, but I think that will need larger changes to the cost modelling anyway. What we have here is at least well defined (the type of the load/store) and is known to fix something that is used incorrectly at the moment.
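To illustrate what "well defined" means here, the following is a minimal standalone sketch of deriving a hint from nothing but the kind of the neighbouring load/store. It is a toy model, not LLVM's API: the `Opcode` and `CastContextHint` enums, `hintFromNeighbour`, `castCost` and the cost values are all hypothetical names and numbers made up for this example.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not LLVM's real types.
enum class Opcode { Load, Store, Add, SExt, ZExt, Trunc };
enum class CastContextHint { None, Normal, Masked, GatherScatter };

// Derive a hint from only the opcode and kind of the neighbouring memory
// operation, without inspecting the instruction itself any further.
CastContextHint hintFromNeighbour(Opcode neighbour, bool masked,
                                  bool gatherScatter) {
  if (neighbour != Opcode::Load && neighbour != Opcode::Store)
    return CastContextHint::None;
  if (gatherScatter)
    return CastContextHint::GatherScatter;
  if (masked)
    return CastContextHint::Masked;
  return CastContextHint::Normal;
}

// Made-up costs: a cast next to a plain load/store is assumed to fold into
// the memory operation (cost 0); every other cast costs 1.
int castCost(Opcode cast, CastContextHint hint) {
  assert(cast == Opcode::SExt || cast == Opcode::ZExt ||
         cast == Opcode::Trunc);
  return hint == CastContextHint::Normal ? 0 : 1;
}
```

In this sketch the cost query needs no instruction at all, only the hint, so it gives the same answer whether it is called from the vectorizer or from a direct cost-model test.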

In the long run I would like to see something that really tries to cost multiple instructions at the same time. If we have trunc(shl(mul(sext, sext))) and we know in the backend that we can convert that to a vmulh, it's going to be next to impossible to cost-model that sensibly without something that looks at the entire tree and gives it a single cost. You don't want to have a sext looking at its uses' uses' uses to see whether the whole thing together makes something that is cheap or something that is expensive. Maybe that's not a great example, but hopefully you can see my point. I imagine it might also need a better way to get context instructions for things that don't exist yet, such as from VPlan recipes or runtime-unrolled loops. It would be good to be able to get a fake instruction analogue without being tied specifically to the original IR. That will all need a lot of careful design though.
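As a rough illustration of the difference between per-instruction costing and costing a whole tree, here is a small standalone sketch. Everything in it is hypothetical: the `Node` type, the `Op` enum, the helper names and the unit costs are invented for the example, and the matched shape simply mirrors the trunc(shl(mul(sext, sext))) example above with the shift amount omitted.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy expression tree; not LLVM IR.
enum class Op { Leaf, SExt, Mul, Shl, Trunc };

struct Node {
  Op op;
  std::vector<std::shared_ptr<Node>> operands;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(Op op, std::vector<NodePtr> operands = {}) {
  return std::make_shared<Node>(Node{op, std::move(operands)});
}

// Per-instruction costing: every non-leaf node costs 1 (made-up unit cost).
int perNodeCost(const NodePtr &n) {
  int cost = n->op == Op::Leaf ? 0 : 1;
  for (const NodePtr &operand : n->operands)
    cost += perNodeCost(operand);
  return cost;
}

bool isOp(const NodePtr &n, Op op, std::size_t arity) {
  return n->op == op && n->operands.size() == arity;
}

// Recognise trunc(shl(mul(sext, sext))), looking down from the root.
bool matchesFusedPattern(const NodePtr &n) {
  if (!isOp(n, Op::Trunc, 1))
    return false;
  const NodePtr &shl = n->operands[0];
  if (!isOp(shl, Op::Shl, 1))
    return false;
  const NodePtr &mul = shl->operands[0];
  return isOp(mul, Op::Mul, 2) && isOp(mul->operands[0], Op::SExt, 1) &&
         isOp(mul->operands[1], Op::SExt, 1);
}

// Whole-tree costing: a matched pattern gets a single cost of 1, plus the
// cost of whatever feeds the two extends.
int treeCost(const NodePtr &n) {
  if (matchesFusedPattern(n)) {
    const NodePtr &mul = n->operands[0]->operands[0];
    return 1 + treeCost(mul->operands[0]->operands[0]) +
           treeCost(mul->operands[1]->operands[0]);
  }
  int cost = n->op == Op::Leaf ? 0 : 1;
  for (const NodePtr &operand : n->operands)
    cost += treeCost(operand);
  return cost;
}
```

The match starts at the root of the tree and looks down through operands, so no node has to walk its users to find out whether it is part of something cheap.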

https://reviews.llvm.org/D79162