[llvm] [AArch64] Add partial reduce patterns for new fdot instructions (PR #184659)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 5 03:34:30 PST 2026
================
@@ -2031,6 +2038,8 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
// We can use SVE2p1 fdot to emulate the fixed-length variant.
setPartialReduceMLAAction(ISD::PARTIAL_REDUCE_FMLA, MVT::v4f32,
MVT::v8f16, Custom);
+ setPartialReduceMLAAction(ISD::PARTIAL_REDUCE_FMLA, MVT::v2f32,
+ MVT::v4f16, Custom);
----------------
sdesmalen-arm wrote:
Doing this is actually not correct unless we know that the `vscale_range` is `(1,1)`, see: https://github.com/llvm/llvm-project/pull/181982#discussion_r2822667752. The same holds for the (existing) pattern right above it, as it might introduce faulting behaviour for the inactive lanes if the runtime vector length is greater than 128 bits.
Can you remove this change from the PR and either remove the other case as well (v8f16->v4f32) or follow it up with a PR to remove the fixed-length variant entirely? (I guess we could handle it by zeroing the inactive lanes first, but I'm not sure using the instruction would still be worth it then.)
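
To make the `vscale_range` condition concrete, here is a minimal standalone sketch of the check it implies. It is not code from the PR or from LLVM: `SVEVectorLengthBounds` and `isFixedLengthFDotSafe` are hypothetical names, and the bounds are assumed to be given in bits, with 0 standing for "no known upper bound".

```cpp
#include <cstdint>

// Hypothetical stand-in for the known bounds on the SVE vector length, in bits.
// MaxBits == 0 is assumed here to mean "no upper bound known".
struct SVEVectorLengthBounds {
  uint32_t MinBits;
  uint32_t MaxBits;
};

// Returns true only when the runtime vector length is known to be exactly
// 128 bits, i.e. vscale_range(1,1). In that case the scalable register has no
// lanes beyond those of the 128-bit fixed-length vector, so there are no
// inactive lanes for an unpredicated instruction to operate on.
bool isFixedLengthFDotSafe(const SVEVectorLengthBounds &B) {
  return B.MinBits == 128 && B.MaxBits == 128;
}
```

With bounds of 128 to 256 bits, or with no known upper bound, the check fails, which corresponds to the case described above where the runtime vector length may exceed 128 bits.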
https://github.com/llvm/llvm-project/pull/184659