[PATCH] D148710: [DAGCombiner] Hoist add/sub binop w/ constant op only if it won't increase divergency node

Sat Apr 22 06:03:13 PDT 2023

foad added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:3874
   // Hoist one-use addition by non-opaque constant:
-  //   (x + C) - y  ->  (x - y) + C
-  if (N0.getOpcode() == ISD::ADD && N0.hasOneUse() &&
-      isConstantOrConstantVector(N0.getOperand(1), /*NoOpaques=*/true)) {
-    SDValue Sub = DAG.getNode(ISD::SUB, DL, VT, N0.getOperand(0), N1);
-    return DAG.getNode(ISD::ADD, DL, VT, Sub, N0.getOperand(1));
-  }
-  // y - (x + C)  ->  (y - x) - C
-  if (N1.getOpcode() == ISD::ADD && N1.hasOneUse() &&
-      isConstantOrConstantVector(N1.getOperand(1), /*NoOpaques=*/true)) {
-    SDValue Sub = DAG.getNode(ISD::SUB, DL, VT, N0, N1.getOperand(0));
-    return DAG.getNode(ISD::SUB, DL, VT, Sub, N1.getOperand(1));
-  }
-  // (x - C) - y  ->  (x - y) - C
-  // This is necessary because SUB(X,C) -> ADD(X,-C) doesn't work for vectors.
-  if (N0.getOpcode() == ISD::SUB && N0.hasOneUse() &&
-      isConstantOrConstantVector(N0.getOperand(1), /*NoOpaques=*/true)) {
-    SDValue Sub = DAG.getNode(ISD::SUB, DL, VT, N0.getOperand(0), N1);
-    return DAG.getNode(ISD::SUB, DL, VT, Sub, N0.getOperand(1));
-  }
-  // (C - x) - y  ->  C - (x + y)
-  if (N0.getOpcode() == ISD::SUB && N0.hasOneUse() &&
-      isConstantOrConstantVector(N0.getOperand(0), /*NoOpaques=*/true)) {
-    SDValue Add = DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(1), N1);
-    return DAG.getNode(ISD::SUB, DL, VT, N0.getOperand(0), Add);
+  if (N0->isDivergent() == N1->isDivergent()) {
+    //   (x + C) - y  ->  (x - y) + C
----------------
bcl5980 wrote:
> foad wrote:
> > foad wrote:
> > > bcl5980 wrote:
> > > > foad wrote:
> > > > > bcl5980 wrote:
> > > > > > foad wrote:
> > > > > > > bcl5980 wrote:
> > > > > > > > foad wrote:
> > > > > > > > > We also want to do this if x is divergent and y is uniform.
> > > > > > > > If x is divergent and y is uniform, we can transform to 
> > > > > > > > (x + C) - y --> x + (C - y)
> > > > > > > > y - (x + C) --> (y - C) - x
> > > > > > > > (x - C) -y   --> x - (C + y) ;; this is speical case that AMDGPU doesnt' support op1 as sgpr, so this case we needn't optimize it. But this should be a TLI behaivor. Generally I think we still can't enable it.
> > > > > > > > (C - x) - y --> (C -y) -x
> > > > > > > > 
> > > > > > > > So I believe in DAGCombiner we only need to enable it when the divergency property is the same.
> > > > > > > > 
> > > > > > > > 
> > > > > > > The rules we implement in DAGCombiner should make sense without having to understand exactly what any particular target can do.
> > > > > > > 
> > > > > > > The original rule here was "pull constants out of nested adds/subs".
> > > > > > > 
> > > > > > > The new rule is "pull constants out of nested adds/subs, unless that increases the number of divergent nodes in the dag".
> > > > > > Yeah, the rule is correct. But it means (N0->isDivergent() == N1->isDivergent()) not (N0->isDivergent() || !N1->isDivergent()) I think.
> > > > > These reassociations only increase the number of divergent nodes in one case: when x is uniform and y is divergent.
> > > > There are 3 operators,  1 operator is uniform, 1 is divergency, 1 is constant, we can always do ` (uniform op constant) op divergency`.
> > > > So we haven't use any particular target information here. I still think `N0->isDivergent() == N1->isDivergent()` is better and cleaner. 
> > > > But if you insist I can change to your way.
> > > > 
> > > > 
> > > If you use `N0->isDivergent() == N1->isDivergent()` then this target independent code will not transform: `(divergent op constant) op uniform` into `(divergent op uniform) op constant`
> > > I think it *should* do that transform, because:
> > > 1. pulling constants out is generally useful (that is why this code exists in the first place)
> > > 2. it does not increase the number of divergent nodes - both before and after the transform, there are two divergent "op" nodes
> > > Can you explain why it should *not* do the transform?
> > > But if you insist I can change to your way.
> > I don't want to insist, I want to reach agreement :)
> I think it’s unnecessary because that case should be transformed to:
> ‘(uniform op constant) op divergent’
> Hoist constant can get potential benefits but uniform instruction can get real benefits. 
> I think it’s unnecessary because that case should be transformed to:
> ‘(uniform op constant) op divergent’
That sounds fine but it is implemented yet (in a target-independent way)? Or can you implement it in this patch or in a follow-up?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148710/new/

https://reviews.llvm.org/D148710