[PATCH] D56636: [CodeGen][X86] Expand vector USUBSAT to UMAX+SUB

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 13 11:36:43 PST 2019


nikic marked 3 inline comments as done.
nikic added inline comments.


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:5317
+    // usub.sat(a, b) -> umax(a, b) - b
+    if (Opcode == ISD::USUBSAT && isOperationLegal(ISD::UMAX, VT)) {
+      SDValue Max = DAG.getNode(ISD::UMAX, dl, VT, LHS, RHS);
----------------
RKSimon wrote:
> nikic wrote:
> > RKSimon wrote:
> > > isOperationLegalOrCustom?
> > Extra diff allowing custom: https://gist.github.com/nikic/4c46634cec8f319e687c6b5cb0496648
> > 
> > This is presumably better than scalarizing, but I was wondering if a more explicit expansion would work better?
> > 
> > As another variant, this is what happens if I just fall through to the USUBO+SELECT expansion: https://gist.github.com/nikic/d989121f7f9898437a1255548c148904
> Allowing legalorcustom looks OK to me - we can tweak per-target codegen in future commits if its useful.
> 
> Also, should we try to do this on scalars as well before defaulting to add/sub overflow? For instance AMDGPU i32 for instance should benefit.
Okay, I did both changes. I agree that preferring this expansion makes sense even for scalar.

I've also dropped the changes in X86ISelLowering. Now that custom is allowed here, they are no longer necessary. We'll just expand to a wide UMAX here that will get split (rather than the USUBSAT getting split and then each half expanded to UMAX).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56636/new/

https://reviews.llvm.org/D56636





More information about the llvm-commits mailing list