[PATCH] D151358: [LegalizeTypes] Improve expansion of wide SMIN/SMAX/UMIN/UMAX

Tue Jun 27 03:18:46 PDT 2023

bjope added a comment.

In D151358#4436222 <https://reviews.llvm.org/D151358#4436222>, @efriedma wrote:

> As far as I can tell, the issue for VE is just that an i128 icmp is lowered in an extremely inefficient way. There's no add-with-carry, and no custom lowering, so we end up using the generic expansion.  And the generic expansion is expensive because setcc is relatively expensive.  Looking at the instruction set, though, it should be possible to compute the flags for a 128-bit compare in three instructions: cmpu+cmpu+cmov; if that were implemented, the new expansion would actually be an improvement.
>
> Given that, I think it makes sense to just move forward with this as-is.  The regression will be automatically fixed when someone adds appropriate custom lowering for 128-bit math on VE.  It would be nice to get feedback from someone more familiar with the VE instruction set, though.

I think we got the same (or similar) problems downstream.

For a smaller example (like the one that you added in https://github.com/llvm/llvm-project/issues/63435) we get this in the DAG after type legalization + combine:

       t36: i16 = setcc t9, t13, setugt:ch
       t37: i16 = setcc t7, t11, setugt:ch
     t43: i16 = select_cc t7, t11, t36, t37, seteq:ch
  t35: i32 = select_cc t43, Constant:i16<0>, t4, t2, setne:ch

which basically lowers to cmpu, condmove, cmpu, condmove, cmps, condmove, cmps, condmove. And each condmove will end up as two moves in the final asm. So that pattern is currently quite expense.
Not sure exactly how we can improve that (one flag register and I think doing conditional compares in the DAG is a bit tricky). So those kinds of constructs are a bit messy and costly to handle.

The old expansion of min/max when we got

  X: i32 = smax t3, t7
    t39: i32 = umax t5, t9
    t42: i32 = select_cc t3, t7, t5, t9, setgt:ch
  Y: i32 = select_cc t3, t7, t39, t42, seteq:ch

instead of

      t41: i16 = setcc t5, t9, setugt:ch
      t42: i16 = setcc t3, t7, setgt:ch
    t49: i16 = select_cc t3, t7, t41, t42, seteq:ch
  X: i32 = select t49, t3, t7
  Y: i32 = select t49, t5, t9

seemed better considering that the smax and umax are legal for our target (while the setcc+setcc+select_cc pattern is expensive). Also the condmove:s that we get from the DAG with smax/umax is just selecting between the existing register values. The condmoves that operate on boolean contents (originating from the setcc pattern) also involves materializing the boolean constants in registers...

I guess I can try to add some custom legalization to get the old result downstream. Or maybe a reverse DAG combine (trying to detect that X simply is an smax out from that select+select_cc+setcc+setcc pattern).
Or maybe there should be some target lowering hooks to let targets override and use the old expansion instead of that new part that introduce the SETCC with non-legal types when expanding min/max?

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:2994
+  // high halves of the operands.  Expand this way if it appears profitable.
+  if (RHSVal && (N->getOpcode() == ISD::UMIN || N->getOpcode() == ISD::UMAX) &&
+                 (RHSVal->countLeadingOnes() >= NumHalfBits ||
----------------
It is a bit unclear to me if this just adds heuristics ("it it appears profitable"), or if something inside the if actually is guarded by the conditions in this if.

To be more specific, if I still want to expand like this for a downstream target, can I just change it to

```
// Heuristic to decide if it is profitable to 
bool IsProfitable =
  (RHSVal && (N->getOpcode() == ISD::UMIN || N->getOpcode() == ISD::UMAX) &&
                 (RHSVal->countLeadingOnes() >= NumHalfBits ||
                  RHSVal->countLeadingZeros() >= NumHalfBits);
if (IsProfitable || TLI->avoidSetccInMinMaxExpand()) {
  ...
}
```

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151358/new/

https://reviews.llvm.org/D151358