[llvm] [SelectionDAG] Fold (icmp eq/ne (shift X, C), 0) -> (icmp eq/ne X, 0) (PR #88801)

Tue May 7 03:22:56 PDT 2024

================
@@ -4516,6 +4516,35 @@ SDValue TargetLowering::SimplifySetCC(EVT VT, SDValue N0, SDValue N1,
         }
       }
     }
+
+    // Optimize
+    //    (setcc (shift N00, N01C), 0, eq/ne) -> (setcc N00, 0, eq/ne)
+    // If all shifted out bits are known to be zero, then the zero'd ness
+    // doesn't change and we can omit the shift.
+    // If all shifted out bits are equal to at least one bit that isn't
+    // shifted out, then the zero'd ness doesn't change and we can omit the
+    // shift.
+    if ((Cond == ISD::SETEQ || Cond == ISD::SETNE) && C1.isZero() &&
+        N0.hasOneUse() &&
+        (N0.getOpcode() == ISD::SHL || N0.getOpcode() == ISD::SRL ||
+         N0.getOpcode() == ISD::SRA)) {
+      bool IsRightShift = N0.getOpcode() != ISD::SHL;
+      SDValue N00 = N0.getOperand(0);
+      // Quick checks based on exact/nuw/nsw flags.
+      if (IsRightShift ? N0->getFlags().hasExact()
+                       : (N0->getFlags().hasNoUnsignedWrap() ||
+                          N0->getFlags().hasNoSignedWrap()))
+        return DAG.getSetCC(dl, VT, N00, N1, Cond);
+      // More expensive checks based on known bits.
+      if (const APInt *ShAmt = DAG.getValidMaximumShiftAmountConstant(N0)) {
+        KnownBits Known = DAG.computeKnownBits(N00);
+        if (IsRightShift)
+          Known = Known.reverseBits();
+        if (ShAmt->ule(Known.countMinLeadingZeros()) ||
+            ShAmt->ult(Known.countMinSignBits()))
----------------
bjope wrote:

IMHO it is a waste of time to derive these things just in case someone wants to use them in the future. We do run the DAG combiner several times, so for example adding some logic in visitSHL to try to derive the flags, potentially failing each time, seems like a real waste if no DAG combine is interested in that information.

And I frankly don't know if poison-generating flags are the solution for caching value tracking results. We could just as well want to cache that we failed to derive that the shift is `exact`, or that a particular shift is known not to be `exact`. But I'm not sure how to deal with that in a nice way.

I think it is much better to find a way where we can compute these things lazily when a fold asks for the information, rather than expecting that the flags are always set when possible. OTOH, when some optimization is interested in these kinds of properties and derives the information via value tracking, then we might want to cache that information somehow to avoid having to compute it again (mutate the DAG and add the flags).
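To make the lazy-plus-cache idea concrete, here is a rough sketch. It does not use SelectionDAG at all: `ShiftNode`, `KnownZeroMask`, `ExactCache` and `isKnownExactLshr` are hypothetical names invented for this illustration, and the "analysis" is just a mask test standing in for value tracking. The point is only that a tri-state cache can remember a failed query as well as a successful one, which a poison-generating flag cannot.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a logical-right-shift node (NOT llvm::SDNode).
struct ShiftNode {
  uint32_t KnownZeroMask;  // operand bits proven to be zero
  unsigned ShAmt;          // constant shift amount, assumed to be < 32
  // Tri-state cache: nullopt = never asked, true = proven exact,
  // false = we tried and could not prove it.
  std::optional<bool> ExactCache = std::nullopt;
  unsigned AnalysisRuns = 0;  // counts how often the analysis really ran
};

// Answers "is this lshr known to be exact?" on demand and caches the
// answer, whether the proof succeeded or failed.
bool isKnownExactLshr(ShiftNode &N) {
  if (N.ExactCache)
    return *N.ExactCache;  // cached result, no recomputation

  ++N.AnalysisRuns;
  // Mask of the low bits that the shift discards.
  uint32_t ShiftedOut = N.ShAmt == 0 ? 0u : (~0u >> (32 - N.ShAmt));
  // Exact if every discarded bit is known to be zero.
  bool Result = (N.KnownZeroMask & ShiftedOut) == ShiftedOut;
  N.ExactCache = Result;
  return Result;
}
```

With this shape, a node that is never queried costs nothing, and a node that is queried repeatedly (for example across several combiner runs) pays for the analysis once. How such a cache would be invalidated when the operand changes is left open here, as it is in the discussion above.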

In this particular case we will fold away the shift if the flags are set. So there is no shift left to add the flags to after doing the fold.
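As a sanity check on why the shift can be dropped when the flags are set, the following small sketch checks the property exhaustively on 8-bit values. It is plain C++, not LLVM code; `shlIsNuw`, `lshrIsExact` and `foldHoldsForAllInputs` are made-up helper names, and only the `nuw` (for shl) and `exact` (for a logical right shift) cases are modelled, not `nsw` or `sra`.

```cpp
#include <cstdint>

// True if `x << c` shifts out no set bits (what `nuw` promises for shl).
bool shlIsNuw(uint8_t x, unsigned c) {
  uint8_t Shifted = static_cast<uint8_t>(x << c);
  return static_cast<uint8_t>(Shifted >> c) == x;
}

// True if `x >> c` shifts out no set bits (what `exact` promises for lshr).
bool lshrIsExact(uint8_t x, unsigned c) {
  uint8_t Shifted = static_cast<uint8_t>(x >> c);
  return static_cast<uint8_t>(Shifted << c) == x;
}

// Checks, for every 8-bit value and every shift amount below 8, that
// whenever no set bits are shifted out, comparing the shifted value with
// zero gives the same answer as comparing the original value with zero.
bool foldHoldsForAllInputs() {
  for (unsigned c = 0; c < 8; ++c) {
    for (unsigned v = 0; v < 256; ++v) {
      uint8_t x = static_cast<uint8_t>(v);
      bool XIsZero = (x == 0);
      if (shlIsNuw(x, c) &&
          (static_cast<uint8_t>(x << c) == 0) != XIsZero)
        return false;
      if (lshrIsExact(x, c) &&
          (static_cast<uint8_t>(x >> c) == 0) != XIsZero)
        return false;
    }
  }
  return true;
}
```

If no set bit is lost, a nonzero input necessarily stays nonzero after the shift, so the eq/ne comparison against zero can be done on the unshifted operand.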

Regardless, if the community decides that flags should always be derived up-front instead of being lazily computed based on need, then implementing that strategy is out of scope for this pull request. I think we really need a much better framework to automatically add flags when creating new nodes, as well as much better documentation regarding which poison-generating flags are mandatory to set on which DAG nodes (including some kind of verifier support), if we want to go that way.

https://github.com/llvm/llvm-project/pull/88801