[PATCH] D144413: [InstCombine] Extend SVEVectorFuseMulAddSub to support newly added "undef" intrinsics.

Fri Mar 10 03:04:27 PST 2023

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1622-1625
+  case Intrinsic::aarch64_sve_fadd_u:
+    return instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul_u,
+                                             Intrinsic::aarch64_sve_fmla_u>(
+        IC, II, true);
----------------
MattDevereau wrote:
> Is this case separate from `instCombineSVEVectorAdd` because the "undef" variant can't combine to fmla_u unless both the fmul and fadd are of the _u variants? Or because this case can't benefit from combines in `instCombineSVEVectorBinOp`?
The formerish. `fadd_m(pg, a, fmul_u(pg, b, c))` expects the inactive elements to come from `a`, which `fmla_u` does not guarantee.  It's worth pointing out the opposite is a valid transformation (i.e. `fadd_u(pg, a, fmul_m(pg, b, c))  -->  fmla_u(a, b, c)` but that's new and I have half a thought it'll be better to soften the `fmul_m` to `fmul_u` rather than jumping straight to `fmla_u`.

This does mean we're not getting the benefit of `instCombineSVEVectorBinOp` but here my plan is to rewrite `m` instrinsics that take an all active predicate to their equivalent `u` intrinsic, to minimise duplication.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144413/new/

https://reviews.llvm.org/D144413