[llvm] [AArch64][SVE] Add codegen support for partial reduction lowering to wide add instructions (PR #114406)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov  7 05:55:20 PST 2024

================
@@ -2041,8 +2041,13 @@ bool AArch64TargetLowering::shouldExpandPartialReductionIntrinsic(
     return true;
 
   EVT VT = EVT::getEVT(I->getType());
-  return VT != MVT::nxv4i64 && VT != MVT::nxv4i32 && VT != MVT::nxv2i64 &&
-         VT != MVT::v4i64 && VT != MVT::v4i32 && VT != MVT::v2i32;
+  auto Op1 = I->getOperand(1);
+  EVT Op1VT = EVT::getEVT(Op1->getType());
+  if (Op1VT.getVectorElementType() == VT.getVectorElementType() &&
+      (VT.getVectorElementCount() * 4 == Op1VT.getVectorElementCount() ||
+       VT.getVectorElementCount() * 2 == Op1VT.getVectorElementCount()))
----------------
MacDue wrote:
It looks like it should fall back to `DAG.getPartialReduceAdd()` (in `performIntrinsicCombine()`), but `tryLowerPartialReductionToWideAdd()` does not seem to check whether a wide add instruction is available.
https://github.com/llvm/llvm-project/pull/114406
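
To make the concern concrete, here is a minimal standalone sketch of the shape check in the hunk above, with an explicit availability guard added. It does not use LLVM's API: `VecTy`, `hasWideAddShape`, `canLowerToWideAdd`, and the `HasWideAdd` flag are hypothetical names for illustration only, and how the real lowering would query instruction availability is an assumption left open here.

```cpp
// Hypothetical stand-in for a vector type; this is not an LLVM class.
struct VecTy {
  unsigned ElemBits; // element width in bits
  unsigned MinElems; // (minimum) number of elements
  bool Scalable;     // true for scalable (SVE-style) vectors
};

// Mirrors the condition in the diff: same element type, and the input has
// 2x or 4x as many elements as the accumulator. Scalability must match too,
// since a fixed element count never equals a scalable one.
constexpr bool hasWideAddShape(VecTy Acc, VecTy In) {
  return Acc.Scalable == In.Scalable && Acc.ElemBits == In.ElemBits &&
         (Acc.MinElems * 2 == In.MinElems || Acc.MinElems * 4 == In.MinElems);
}

// Assumed guard: only pick the wide-add path when the target reports that a
// wide add instruction exists; otherwise the caller would use the generic
// fallback.
constexpr bool canLowerToWideAdd(VecTy Acc, VecTy In, bool HasWideAdd) {
  return HasWideAdd && hasWideAddShape(Acc, In);
}
```

With this split, a matching shape on a target without wide adds is rejected by the guard rather than reaching a lowering that cannot be selected.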
    
    