[llvm] [AArch64][SVE] Add codegen support for partial reduction lowering to wide add instructions (PR #114406)

James Chesterman via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 7 06:56:41 PST 2024


================
@@ -2041,8 +2041,13 @@ bool AArch64TargetLowering::shouldExpandPartialReductionIntrinsic(
     return true;
 
   EVT VT = EVT::getEVT(I->getType());
-  return VT != MVT::nxv4i64 && VT != MVT::nxv4i32 && VT != MVT::nxv2i64 &&
-         VT != MVT::v4i64 && VT != MVT::v4i32 && VT != MVT::v2i32;
+  auto Op1 = I->getOperand(1);
+  EVT Op1VT = EVT::getEVT(Op1->getType());
+  if (Op1VT.getVectorElementType() == VT.getVectorElementType() &&
+      (VT.getVectorElementCount() * 4 == Op1VT.getVectorElementCount() ||
+       VT.getVectorElementCount() * 2 == Op1VT.getVectorElementCount()))
----------------
JamesChesterman wrote:

I think I did this with type legalisation in mind. For example, we do want a partial reduction from `nxv4i16` to `nxv2i32` (after the input has been extended) to be possible, right? I also thought that checking whether SVE is enabled would be enough to confirm that the wide add instruction is available, or is there something I'm missing?
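
To make the shape check in the hunk above concrete, here is a minimal standalone sketch of the same condition. It is not the code in the PR: `VecType`, `matchesWideAddShape` and `runExamples` are hypothetical names used only for illustration, and `VecType` is a simplified stand-in for `EVT` that models just the element width, the minimum element count and the scalable flag. It does not model any subtarget feature check, which would be a separate condition.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for llvm::EVT (assumption, not LLVM API).
struct VecType {
  unsigned ElemBits; // element width in bits, e.g. 32 for i32
  unsigned MinElems; // minimum element count, e.g. 2 for nxv2i32
  bool Scalable;     // true for nxv* types, false for fixed-width v* types
};

// Mirrors the condition in the hunk: the input and result share an element
// type, and the input has exactly 2x or 4x as many elements as the result.
constexpr bool matchesWideAddShape(const VecType &Result,
                                   const VecType &Input) {
  // Element counts only compare equal when both are scalable or both fixed.
  if (Result.Scalable != Input.Scalable)
    return false;
  if (Result.ElemBits != Input.ElemBits)
    return false;
  return Result.MinElems * 4 == Input.MinElems ||
         Result.MinElems * 2 == Input.MinElems;
}

// Usage examples for the cases discussed in the comment.
inline void runExamples() {
  constexpr VecType nxv2i32{32, 2, true};
  constexpr VecType nxv4i32{32, 4, true};
  constexpr VecType nxv4i16{16, 4, true};
  constexpr VecType nxv16i32{32, 16, true};

  // nxv4i16 input already extended to nxv4i32, reduced into nxv2i32.
  assert(matchesWideAddShape(nxv2i32, nxv4i32));
  // The unextended nxv4i16 input has a different element type.
  assert(!matchesWideAddShape(nxv2i32, nxv4i16));
  // A 4x element-count ratio also satisfies the condition.
  assert(matchesWideAddShape(nxv4i32, nxv16i32));
  // Identical input and result types do not match.
  assert(!matchesWideAddShape(nxv4i32, nxv4i32));
}
```

Under these assumptions, the `nxv4i16` to `nxv2i32` case from the comment only matches once the input has been extended to `nxv4i32`, because the check compares element types after extension.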

https://github.com/llvm/llvm-project/pull/114406

