[PATCH] D144413: [InstCombine] Extend SVEVectorFuseMulAddSub to support newly added "undef" intrinsics.
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 10 03:04:27 PST 2023
paulwalker-arm added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1622-1625
+ case Intrinsic::aarch64_sve_fadd_u:
+ return instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul_u,
+ Intrinsic::aarch64_sve_fmla_u>(
+ IC, II, true);
----------------
MattDevereau wrote:
> Is this case separate from `instCombineSVEVectorAdd` because the "undef" variant can't combine to fmla_u unless both the fmul and fadd are of the _u variants? Or because this case can't benefit from combines in `instCombineSVEVectorBinOp`?
The formerish. `fadd_m(pg, a, fmul_u(pg, b, c))` expects the inactive elements to come from `a`, which `fmla_u` does not guarantee. It's worth pointing out the opposite is a valid transformation (i.e. `fadd_u(pg, a, fmul_m(pg, b, c)) --> fmla_u(a, b, c)` but that's new and I have half a thought it'll be better to soften the `fmul_m` to `fmul_u` rather than jumping straight to `fmla_u`.
This does mean we're not getting the benefit of `instCombineSVEVectorBinOp` but here my plan is to rewrite `m` instrinsics that take an all active predicate to their equivalent `u` intrinsic, to minimise duplication.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144413/new/
https://reviews.llvm.org/D144413
More information about the llvm-commits
mailing list