[llvm] [AArch64][SVE] Add codegen support for partial reduction lowering to wide add instructions (PR #114406)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 7 05:55:20 PST 2024
================
@@ -2041,8 +2041,13 @@ bool AArch64TargetLowering::shouldExpandPartialReductionIntrinsic(
return true;
EVT VT = EVT::getEVT(I->getType());
- return VT != MVT::nxv4i64 && VT != MVT::nxv4i32 && VT != MVT::nxv2i64 &&
- VT != MVT::v4i64 && VT != MVT::v4i32 && VT != MVT::v2i32;
+ auto Op1 = I->getOperand(1);
+ EVT Op1VT = EVT::getEVT(Op1->getType());
+ if (Op1VT.getVectorElementType() == VT.getVectorElementType() &&
+ (VT.getVectorElementCount() * 4 == Op1VT.getVectorElementCount() ||
+ VT.getVectorElementCount() * 2 == Op1VT.getVectorElementCount()))
----------------
MacDue wrote:
It looks like it should fall back to `DAG.getPartialReduceAdd()` (in `performIntrinsicCombine()`), but `tryLowerPartialReductionToWideAdd()` does not seem to check whether a wide add instruction is available.
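
To make the concern concrete, here is a minimal standalone sketch, not LLVM code. `VecTy`, `matchesHunkCondition()`, `Lowering` and `chooseLowering()` are hypothetical names invented for illustration, and the `HasWideAdd` flag stands in for whatever subtarget query would be appropriate. The first function only mirrors the type condition quoted in the hunk above; the second shows the kind of guard that appears to be missing, where the generic partial-reduce path is used whenever a wide add is not available.

```cpp
// Hypothetical stand-in for the two llvm::EVT properties the hunk inspects.
struct VecTy {
  unsigned ElemBits; // element width in bits
  unsigned MinElems; // (minimum) number of elements
};

// Mirrors the condition quoted in the hunk: same element type, and the
// operand has 2x or 4x as many elements as the result.
bool matchesHunkCondition(VecTy Result, VecTy Op1) {
  return Op1.ElemBits == Result.ElemBits &&
         (Result.MinElems * 4 == Op1.MinElems ||
          Result.MinElems * 2 == Op1.MinElems);
}

enum class Lowering { WideAdd, GenericPartialReduce };

// Hypothetical guard (an assumption, not the PR's code): pick the wide-add
// lowering only when the target reports such an instruction; otherwise
// take the generic fallback path.
Lowering chooseLowering(bool HasWideAdd, bool PatternMatches) {
  if (HasWideAdd && PatternMatches)
    return Lowering::WideAdd;
  return Lowering::GenericPartialReduce;
}
```

With a guard of this shape, a target without wide adds would get `Lowering::GenericPartialReduce` even when the pattern matches, which is the fallback behaviour I would expect.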
https://github.com/llvm/llvm-project/pull/114406