[llvm] [AArch64] Improve code generation of bool vector reduce operations (PR #115713)
Nikita Popov via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 11 02:02:36 PST 2024
Csanád Hajdú <csanad.hajdu at arm.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/115713 at github.com>
================
@@ -15841,11 +15841,27 @@ static SDValue getVectorBitwiseReduce(unsigned Opcode, SDValue Vec, EVT VT,
       return getVectorBitwiseReduce(Opcode, HalfVec, VT, DL, DAG);
     }
-    // Vectors that are less than 64 bits get widened to neatly fit a 64 bit
-    // register, so e.g. <4 x i1> gets lowered to <4 x i16>. Sign extending to
+    // Results of setcc operations get widened to 128 bits for xor reduce if
+    // their input operands are 128 bits wide, otherwise vectors that are less
+    // than 64 bits get widened to neatly fit a 64 bit register, so e.g.
+    // <4 x i1> gets lowered to either <4 x i16> or <4 x i32>. Sign extending to
     // this element size leads to the best codegen, since e.g. setcc results
     // might need to be truncated otherwise.
-    EVT ExtendedVT = MVT::getIntegerVT(std::max(64u / NumElems, 8u));
+    unsigned ExtendedWidth = 64;
+    if (ScalarOpcode == ISD::XOR && Vec.getOpcode() == ISD::SETCC &&
+        Vec.getOperand(0).getValueSizeInBits() >= 128) {
+      ExtendedWidth = 128;
+    }
+    EVT ExtendedVT = MVT::getIntegerVT(std::max(ExtendedWidth / NumElems, 8u));
+
+    // Negate the reduced vector value for reduce and operations that use
+    // fcmp.
+    if (ScalarOpcode == ISD::AND && NumElems < 16) {
+      Vec = DAG.getNode(
+          ISD::XOR, DL, VecVT, Vec,
+          DAG.getSplatVector(
+              VecVT, DL, DAG.getConstant(APInt::getAllOnes(32), DL, MVT::i32)));
----------------
nikic wrote:
FYI there is a `DAG.getAllOnesConstant(DL, MVT::i32)` helper for this.
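
As a side note on the same hunk, the element-width arithmetic described in the updated comment can be checked in isolation. The following is only a minimal standalone sketch, not LLVM code: `maxU`, `extendedElementBits`, and the `WideXorOfSetcc` flag are hypothetical names standing in for `std::max`, the `ExtendedVT` computation, and the `ISD::XOR`/`ISD::SETCC`/128-bit operand condition from the diff.

```cpp
// Hypothetical helper (not part of LLVM): a constexpr stand-in for std::max.
constexpr unsigned maxU(unsigned A, unsigned B) { return A > B ? A : B; }

// Hypothetical helper (not part of LLVM): mirrors the arithmetic
//   std::max(ExtendedWidth / NumElems, 8u)
// from the hunk. WideXorOfSetcc models the case where the reduction is an
// xor of a setcc whose input operands are at least 128 bits wide, in which
// case the budget is 128 bits instead of 64.
constexpr unsigned extendedElementBits(unsigned NumElems, bool WideXorOfSetcc) {
  // Split the register budget evenly across the elements, but never go
  // below 8 bits per element.
  return maxU((WideXorOfSetcc ? 128u : 64u) / NumElems, 8u);
}
```

Under these assumptions, a `<4 x i1>` vector maps to 16-bit elements with the 64-bit budget and to 32-bit elements with the 128-bit budget, which matches the `<4 x i16>` or `<4 x i32>` wording in the comment.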
https://github.com/llvm/llvm-project/pull/115713