[llvm] [X86] Remove redundant TEST after shifts when count is non-zero (PR #169069)

Tue Nov 25 05:48:51 PST 2025

================
@@ -23625,6 +23625,60 @@ static SDValue EmitTest(SDValue Op, X86::CondCode X86CC, const SDLoc &dl,
     return DAG.getNode(X86ISD::SUB, dl, VTs, Op->getOperand(0),
                        Op->getOperand(1)).getValue(1);
   }
+  case ISD::SHL:
+  case ISD::SRL:
+  case ISD::SRA: {
+    SDValue Amt = ArithOp.getOperand(1);
+
+    // Skip Constants
+    if (isa<ConstantSDNode>(Amt))
+      break;
+
+    // If optimising for size and we can guarantee the shift amt is never zero,
+    // we can remove the test.
+    bool OptForSize = DAG.getMachineFunction().getFunction().hasOptSize();
+
+    if (!OptForSize)
----------------
RKSimon wrote:

If you have access to a range of CPUs, I'd be interested to know whether enough of them have fast shift EFLAGS handling to justify a tuning flag that would allow this fold to be used in all cases, not just for optsize builds. A quick review of uops.info suggested Intel CPUs started going multi-uop after Sandy Bridge, but Atoms/E-cores never did, and neither did AMD. It'd be useful to see any runtime benchmarks that back this up.
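
For readers following the thread, a minimal sketch of why the fold needs a provably non-zero count: on x86, a shift whose masked count is zero leaves EFLAGS untouched, so ZF would still reflect an earlier instruction. The `Flags` struct and `shl32` helper below are hypothetical names used only to model that behaviour for a 32-bit SHL; they are not part of the patch or of LLVM.

```cpp
#include <cstdint>

// Hypothetical model of the one flag we care about (ZF).
struct Flags {
  bool zf;
};

// Models a 32-bit x86 SHL by a variable count (assumption: only ZF is tracked).
// The hardware masks the count to 5 bits for 32-bit operands; if the masked
// count is zero, the flags are left unchanged.
inline uint32_t shl32(uint32_t x, uint8_t count, Flags &f) {
  unsigned c = count & 31u;
  if (c == 0)
    return x;          // flags preserved: ZF is stale, so a TEST is still needed
  uint32_t r = x << c;
  f.zf = (r == 0);     // ZF now describes the shift result, so a TEST is redundant
  return r;
}
```

In this model, ZF can stand in for a following `TEST r, r` only when the masked count is known to be non-zero, which is the condition the patch checks before dropping the test.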

https://github.com/llvm/llvm-project/pull/169069