[llvm] [LoopVectorizer][AArch64] Add support for partial reduce subtraction (PR #123636)

Nicholas Guy via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 7 03:10:50 PST 2025


================
@@ -318,13 +332,20 @@ void VPPartialReductionRecipe::execute(VPTransformState &State) {
   State.setDebugLocFrom(getDebugLoc());
   auto &Builder = State.Builder;
 
-  assert(getOpcode() == Instruction::Add &&
-         "Unhandled partial reduction opcode");
-
   Value *BinOpVal = State.get(getOperand(0));
   Value *PhiVal = State.get(getOperand(1));
   assert(PhiVal && BinOpVal && "Phi and Mul must be set");
 
+  unsigned Opcode = getOpcode();
+
+  if (Opcode == Instruction::Sub) {
+    bool HasNSW = cast<Instruction>(BinOpVal)->hasNoSignedWrap();
+    BinOpVal = Builder.CreateNeg(BinOpVal, "", HasNSW);
+    Opcode = Instruction::Add;
+  }
----------------
NickGuy-Arm wrote:

Done, though it caused the cost model to evaluate the VPlan as more expensive, so scalable vectors are no longer emitted by default in this case. (I believe this is similar to the reason `-vectorizer-maximize-bandwidth` exists, but even with that flag, fixed-width plans are chosen over scalable plans.)

A workaround is to pass the opt arguments `-force-vector-width=8` (or `16`) and `-scalable-vectorization=preferred` together, or in C++ to use the loop pragma `vectorize_width(8, scalable)` (or `16`), to force a VF that partial reductions support.
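
For illustration, here is a minimal sketch of the pragma form of the workaround. The function name `dot_sub` and the loop body are hypothetical and not taken from the patch; they are just one plausible shape of a subtracting reduction over widened `i8` products. Whether this actually lowers to a partial reduction depends on the target features and the cost model, so treat it as an assumption rather than a guaranteed outcome.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel (not from the patch): subtracts widened i8 products
// from an i32 accumulator, i.e. a reduction whose update is a 'sub'.
int32_t dot_sub(const int8_t *a, const int8_t *b, std::size_t n) {
  int32_t acc = 0;
  // Request a scalable VF of 16 for this loop. The pragma is clang-specific,
  // so it is guarded to keep other compilers from warning about it.
#if defined(__clang__)
#pragma clang loop vectorize_width(16, scalable)
#endif
  for (std::size_t i = 0; i < n; ++i)
    acc -= static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  return acc;
}
```

The pragma only affects how the loop is vectorized; the scalar result is the same with or without it.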

https://github.com/llvm/llvm-project/pull/123636

