[PATCH] D56636: [CodeGen][X86] Expand vector USUBSAT to UMAX+SUB

Sun Jan 13 10:22:19 PST 2019

RKSimon added inline comments.

================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:5317
+    // usub.sat(a, b) -> umax(a, b) - b
+    if (Opcode == ISD::USUBSAT && isOperationLegal(ISD::UMAX, VT)) {
+      SDValue Max = DAG.getNode(ISD::UMAX, dl, VT, LHS, RHS);
----------------
nikic wrote:
> RKSimon wrote:
> > isOperationLegalOrCustom?
> Extra diff allowing custom: https://gist.github.com/nikic/4c46634cec8f319e687c6b5cb0496648
> 
> This is presumably better than scalarizing, but I was wondering if a more explicit expansion would work better?
> 
> As another variant, this is what happens if I just fall through to the USUBO+SELECT expansion: https://gist.github.com/nikic/d989121f7f9898437a1255548c148904
Allowing legalorcustom looks OK to me - we can tweak per-target codegen in future commits if its useful.

Also, should we try to do this on scalars as well before defaulting to add/sub overflow? For instance AMDGPU i32 for instance should benefit.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:1224
+    if (!HasInt256)
+      setOperationAction(ISD::USUBSAT, MVT::v8i32, Custom);

----------------
nikic wrote:
> RKSimon wrote:
> > nikic wrote:
> > > RKSimon wrote:
> > > > nikic wrote:
> > > > > RKSimon wrote:
> > > > > > LowerADDSAT_SUBSAT supports AVX1 splitting, so why not:
> > > > > > ```
> > > > > > setOperationAction(ISD::USUBSAT, MVT::v8i32, HasInt256 ? Legal : Custom);
> > > > > > ```
> > > > > > 
> > > > > > Also, v4i64  should be ok as well.
> > > > > I've added this to get the splitting, but I don't think v8i32 would be legal, as we need it to expand. Maybe `hasInt256 ? Expand : Custom` would make it more obvious what the intention here is?
> > > > Better for the existing x86 lowering code to handle it IMO - that's what we do for nearly all 256-bit integer code on AVX1 targets.
> > > That's what this is intended to do. v8i32 is custom so lowering splits it into two v4i32s, which then get expanded.
> > > 
> > > If I change this to `HasInt256 ? Legal : Custom`, then I get:
> > > 
> > >     LLVM ERROR: Cannot select: t5: v8i32 = usubsat t2, t4
> > > 
> > > Or am I misunderstanding what you have in mind here?
> > If you look at LowerABS just below LowerADDSAT_SUBSAT you'll see the approach that I took.
> I've changed this to always Custom and changed the LowerADDSAT_SUBSAT implementation to look more like LowerABS. Is this right?
```
setOperationAction(ISD::USUBSAT, MVT::v8i32, HasInt256 ? Legal : Custom);
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56636/new/

https://reviews.llvm.org/D56636