[llvm] [AArch64][SVE] Add codegen support for partial reduction lowering to wide add instructions (PR #114406)
James Chesterman via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 8 03:03:24 PST 2024
================
@@ -2041,8 +2041,13 @@ bool AArch64TargetLowering::shouldExpandPartialReductionIntrinsic(
return true;
EVT VT = EVT::getEVT(I->getType());
- return VT != MVT::nxv4i64 && VT != MVT::nxv4i32 && VT != MVT::nxv2i64 &&
- VT != MVT::v4i64 && VT != MVT::v4i32 && VT != MVT::v2i32;
+ auto Op1 = I->getOperand(1);
+ EVT Op1VT = EVT::getEVT(Op1->getType());
+ if (Op1VT.getVectorElementType() == VT.getVectorElementType() &&
+ (VT.getVectorElementCount() * 4 == Op1VT.getVectorElementCount() ||
+ VT.getVectorElementCount() * 2 == Op1VT.getVectorElementCount()))
----------------
JamesChesterman wrote:
This is now resolved with an additional check in the `tryLowerPartialReductionToWideAdd` function.
https://github.com/llvm/llvm-project/pull/114406
More information about the llvm-commits
mailing list