[PATCH] D151358: [LegalizeTypes] Improve expansion of wide SMIN/SMAX/UMIN/UMAX

Wed Jun 28 06:47:34 PDT 2023

bjope added a comment.

In D151358#4453100 <https://reviews.llvm.org/D151358#4453100>, @efriedma wrote:

> I think any of the following makes the new expansion cheaper:
>
> - A subtract-with-borrow/carry instruction
> - A cheap setcc/select
> - Some cheap way to do a "select" on flags.
> - Branches are reasonably cheap, so you can expand the select-of-select construct to branches.
>
> If your target has none of those, a target lowering hook is reasonable.  Seems a little surprising to me, though; all targets I know of with a dedicated "flags" register have a subtract-with-borrow.

I looked a bit at what happens for other targets, and I see that the SETCC often ends up being expanded into USUBO+SETCCCARRY. Maybe I could add some support for those DAG nodes, but I doubt that the end result would be good for the min/max cases anyway.

A nice lowering for a VLIW target (with min/max machine instructions, predication, etc,) would be something like this for unsigned max:

  {
    umax x.hi, y.hi => res.hi          ; res.hi
    umax x.lo, y.lo => res.lo          ; res.lo when x.hi==y.hi
    compare x.hi, y.hi => CC
  }
  {
    condmove x.lo => res.lo  if CC.ugt     ; res.lo when x.hi "won"
    condmove y.lo => res.lo, if CC.ult     ; res.lo when y.hi "won"
  }

So that would be just two bundles, no branches and only a single compare to update CC.
It also matches quite good with the old expansion on a selection DAG level.

I figure I would need to find a way to map those setcc/selectcc operations back to using max/min to get something nearly as good as that (at least if being limited by a single CC flag register).

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:2994
+  // high halves of the operands.  Expand this way if it appears profitable.
+  if (RHSVal && (N->getOpcode() == ISD::UMIN || N->getOpcode() == ISD::UMAX) &&
+                 (RHSVal->countLeadingOnes() >= NumHalfBits ||
----------------
efriedma wrote:
> bjope wrote:
> > It is a bit unclear to me if this just adds heuristics ("it it appears profitable"), or if something inside the if actually is guarded by the conditions in this if.
> > 
> > To be more specific, if I still want to expand like this for a downstream target, can I just change it to
> > 
> > ```
> > // Heuristic to decide if it is profitable to 
> > bool IsProfitable =
> >   (RHSVal && (N->getOpcode() == ISD::UMIN || N->getOpcode() == ISD::UMAX) &&
> >                  (RHSVal->countLeadingOnes() >= NumHalfBits ||
> >                   RHSVal->countLeadingZeros() >= NumHalfBits);
> > if (IsProfitable || TLI->avoidSetccInMinMaxExpand()) {
> >   ...
> > }
> > ```
> The expansion is fully general; you could use it for anything.  The guard is just checking for profitability (whether the upper umin/umax will simplify).
Thanks for confirming.

That makes it easy for us to add our target to the profitability check to use the old expansion in more cases. 

I'll handle that downstream for now, since I don't know about any in-tree target that I could use in a test case to show the need for a target lowering hook.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151358/new/

https://reviews.llvm.org/D151358