[PATCH] D132966: [TTI] Allow passing ArrayRef of context instructions (NFC).

Wed Aug 31 10:09:21 PDT 2022

ABataev added a comment.

In D132966#3761614 <https://reviews.llvm.org/D132966#3761614>, @fhahn wrote:

> In D132966#3761479 <https://reviews.llvm.org/D132966#3761479>, @ABataev wrote:
>
>> In D132966#3761409 <https://reviews.llvm.org/D132966#3761409>, @fhahn wrote:
>>
>>> In D132966#3760480 <https://reviews.llvm.org/D132966#3760480>, @dmgreen wrote:
>>>
>>>> This seems to mess up the interface to TTI quite a lot. Are there any other cases than the SLP vectorizer where se would pass a vector of Instructions?
>>>
>>> Yeah the new argument is specifically to support SLP's use case. I don't think other passes are in a similar situation at the moment. There's also a version that keeps the logic in SLP: D132872 <https://reviews.llvm.org/D132872>, but @ABataev  argued to have this generally available.
>>
>> Maybe add a specific function which returns bool if preferable to use FMA instead?
>
> I think the issue here is that is not as simple as asking a boolean question.
>
> We need to adjust both the scalar and vector costs, depending on whether either can use FMAs. I think if we support this in TTI, then it should be integrated into the existing APIs.

Agree, that's why I thought it is better to make it part of TTI.

> If we add a new interface just geared at the SLP use case, general TTI users won't benefit anyways and then IMO it would be better to keep SLP logic in `SLPVectorizer.cpp`, at least initially.

We already have SLP specific functions (at least for now) in TTI.

>>>> The CxtI only has to be a context. It gets a bit fuzzy, but could we just pass the first instruction if it is similar enough to the other instructions in the TreeEntry? It looks like the first item is already passed in at the moment.
>>>
>>> I think all instructiosn in a TreeEntry should be very similar in almost all cases (same opcode). But here we need to specifically look at the users to determine if the users of all instructions in the bundle will allow fusion.
>>>
>>> Now while spelling this out, maybe we could instead fuse elegible FMUL + FADD/FSUB TreeEntry nodes directly to a single FMULADD/SUB TreeEntry intead of checking for fusion opportunities for the vector version? @ABataev  do you think that would be easily do-able?
>>
>> Everything is doable, it is just a question of time. Need to adjust the cost somehow, add a flag (probably!) to the node(s) for possible "FMAsation" and change the codegen to emit FMA instead of fmul+fadd/fsub.
>
> Right, the question is what the best path forward is to incrementally improve the situation without adding too much churn until we know the cost-based decision works well for a range of targets.

The cost still needs to be adjusted, before we do actual replacement.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132966/new/

https://reviews.llvm.org/D132966