[llvm] [AArch64][LV][SLP] Vectorizers use getFRemInstrCost for frem costs (PR #82488)

Paschalis Mpeis via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 28 09:12:04 PST 2024


paschalis-mpeis wrote:

> Alexey: That should be fine, what's the dangerous in it?

Currently, `frem` relies on the `ReplaceWithVeclib` pass, so if for whatever reason that pass does not run, codegen will crash.
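To illustrate, here is a minimal sketch of the situation (the mangled SLEEF name shown in the comment is an assumption; the actual callee depends on the configured vector library):

```llvm
; Vector frem has no direct AArch64 lowering. Unless ReplaceWithVeclib
; rewrites it into a vector-math-library call first, instruction selection
; fails on this function.
define <4 x float> @vec_frem(<4 x float> %a, <4 x float> %b) {
  %r = frem <4 x float> %a, %b
  ret <4 x float> %r
}

; After ReplaceWithVeclib (assuming a SLEEF-style vector library is enabled),
; the frem would instead become a call along the lines of:
;   %r = call <4 x float> @_ZGVnN4vv_fmodf(<4 x float> %a, <4 x float> %b)
```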


> Paul: What if one opcode needs TLI for a different reason? all of a sudden all existing callers are entered into the contract (FREM is guaranteed to be transformed into a math call) without ensuring that's actually the case

If we encapsulate this functionality in `getArithmeticInstrCost`, we enter all existing callers into this 'contract'.
That means a caller could, at some point in the future, pass a TLI object and get the updated costs in a scenario where `ReplaceWithVeclib` happens to **not** run.

So, until CodeGen gains support for emitting vectorized frem, we want to be very explicit about it.

### Examples:
1) A new pass might call `getArithmeticInstrCost` and pass a TLI there, which may happen to have veclib support for frem.

2) Similarly, a maintainer might decide to pass TLI to an existing callsite of `getArithmeticInstrCost`, again getting the updated costs.

In both examples, the returned costs will favor replacing with a vectorized frem; however, this could happen at a point in the pipeline where `ReplaceWithVeclib` does not run, causing codegen to crash soon after.


---

## What's next:
Ideally, we would also prefer to have this merged into `getArithmeticInstrCost`, but we are reluctant to do so until codegen learns to deal with `frem` in all cases.

So until then, we insist on being very explicit in the cases that knowingly call the special `getFRemInstrCost` (the SLP and Loop vectorizers). For now, I could add a `TODO` note of this intention to `getFRemInstrCost`; in the future, once codegen gains such support, we'd be happy to fully abstract it away inside TTI.

@alexey-bataev while not ideal, does the above make things clearer? Also, assuming we proceed with this approach for this patch, what changes would you require?

https://github.com/llvm/llvm-project/pull/82488
