[llvm] [AArch64] Improve code generation of bool vector reduce operations (PR #115713)

Tue Nov 12 01:16:16 PST 2024

================
@@ -15833,10 +15848,36 @@ static SDValue getVectorBitwiseReduce(unsigned Opcode, SDValue Vec, EVT VT,
         ExtendOp, DL, VecVT.changeVectorElementType(ExtendedVT), Vec);
     switch (ScalarOpcode) {
     case ISD::AND:
-      Result = DAG.getNode(ISD::VECREDUCE_UMIN, DL, ExtendedVT, Extended);
+      if (NumElems < 16) {
+        // Check if all lanes of the negated bool vector value are zero by
+        // comparing against 0.0 with ordered and equal predicate. The only
+        // non-zero bit pattern that compares ordered and equal to 0.0 is -0.0,
+        // where only the sign bit is set. However the bool vector is
+        // sign-extended so that each bit in a lane is either zero or one,
+        // meaning that it is impossible to get the bit pattern of -0.0.
+        assert(Extended.getValueSizeInBits() == 64);
+        Extended = DAG.getBitcast(MVT::f64, Extended);
+        Result = DAG.getNode(ISD::SETCC, DL, MVT::i32, Extended,
+                             DAG.getConstantFP(0.0, DL, MVT::f64),
+                             DAG.getCondCode(ISD::CondCode::SETOEQ));
----------------
Il-Capitano wrote:

I should use `DAG.getSetCC` here instead of building the `ISD::SETCC` node directly with `DAG.getNode`.
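
For readers following the reasoning in the code comment above, here is a minimal standalone sketch of the bit-pattern argument, written in plain C++ rather than against the SelectionDAG API. The helper names (`packSignExtendedLanes`, `allLanesZeroViaFP`, `checkAllLaneCombinations`) are hypothetical and not part of the patch. It assumes an IEEE-754 binary64 `double` and a default floating-point environment (no flush-to-zero or denormals-are-zero mode), and it models only the eight-lane, 8-bit-per-lane case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE-754 doubles");

// Hypothetical helper: pack eight bool lanes into one 64-bit value,
// sign-extending each lane to 0x00 (false) or 0xFF (true).
inline std::uint64_t packSignExtendedLanes(const bool (&lanes)[8]) {
  std::uint64_t bits = 0;
  for (int i = 0; i < 8; ++i)
    if (lanes[i])
      bits |= std::uint64_t{0xFF} << (8 * i);
  return bits;
}

// Hypothetical helper: reinterpret the 64 bits as a double and compare
// against 0.0. operator== on doubles is false when an operand is NaN, so
// it behaves like an ordered-and-equal comparison.
inline bool allLanesZeroViaFP(std::uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d == 0.0;
}

// Exhaustively check all 256 lane combinations: the floating-point
// comparison must agree with a plain integer test against zero.
inline bool checkAllLaneCombinations() {
  for (unsigned mask = 0; mask < 256; ++mask) {
    bool lanes[8];
    for (int i = 0; i < 8; ++i)
      lanes[i] = ((mask >> i) & 1u) != 0;
    const std::uint64_t bits = packSignExtendedLanes(lanes);
    if (allLanesZeroViaFP(bits) != (bits == 0))
      return false;
  }
  return true;
}
```

Calling `checkAllLaneCombinations()` walks every lane combination. The one non-zero pattern that would also compare equal to 0.0, the -0.0 pattern `0x8000000000000000`, needs a top byte of `0x80`, which a lane that is always `0x00` or `0xFF` cannot produce.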

https://github.com/llvm/llvm-project/pull/115713