[llvm] [LoopVectorizer][AArch64] Add support for partial reduce subtraction (PR #123636)
Nicholas Guy via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 7 03:10:50 PST 2025
================
@@ -318,13 +332,20 @@ void VPPartialReductionRecipe::execute(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
auto &Builder = State.Builder;
- assert(getOpcode() == Instruction::Add &&
- "Unhandled partial reduction opcode");
-
Value *BinOpVal = State.get(getOperand(0));
Value *PhiVal = State.get(getOperand(1));
assert(PhiVal && BinOpVal && "Phi and Mul must be set");
+ unsigned Opcode = getOpcode();
+
+ if (Opcode == Instruction::Sub) {
+ bool HasNSW = cast<Instruction>(BinOpVal)->hasNoSignedWrap();
+ BinOpVal = Builder.CreateNeg(BinOpVal, "", HasNSW);
+ Opcode = Instruction::Add;
+ }
----------------
NickGuy-Arm wrote:
Done, though it caused the cost model to evaluate the VPlan as more expensive, so scalable vectors are not emitted by default in this case. (I believe it's similar to the reason `-vectorizer-maximize-bandwidth` exists, but even with that flag, fixed-width plans are chosen over scalable plans.)
A workaround is to pass opt the arguments `-force-vector-width=8/16 -scalable-vectorization=preferred` together (or, in C++, to use the loop pragma `vectorize_width(8/16, scalable)`), where 8/16 means either 8 or 16. This forces a VF that partial reductions support.
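
For illustration, a minimal sketch of the pragma form of the workaround. The kernel `dot_sub` and its signature are hypothetical and not taken from the PR or its tests; it only shows where the pragma goes on a subtracting multiply-accumulate loop. Whether a partial reduction is actually formed still depends on the target features and the cost model.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel (an assumption for illustration): subtracts the
// widened products of two i8 arrays from a 32-bit accumulator.
int32_t dot_sub(const int8_t *a, const int8_t *b, size_t n) {
  int32_t acc = 0;
  // Request a scalable VF of 16 for this loop. The guard keeps
  // non-Clang compilers from seeing an unknown pragma.
#if defined(__clang__)
#pragma clang loop vectorize_width(16, scalable)
#endif
  for (size_t i = 0; i < n; ++i)
    acc -= static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  return acc;
}
```

The pragma only changes how the loop is vectorized, so the scalar result is the same with or without it.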
https://github.com/llvm/llvm-project/pull/123636