[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection

Thu Jan 20 09:04:08 PST 2022

alex-t added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12491-12493
+    if (isCommutativeBinOp(N1->getOpcode()) &&
+        DAG.isConstantIntBuildVectorOrConstantInt(N1->getOperand(1)))
+      return true;
----------------
foad wrote:
> I don't understand this heuristic. Can you give an example of when it would help?
I could show the concrete example, but I would need to paste the DAGs here, which looks like overkill. So I'll try to explain without the drawing.
Imagine we have a sub-tree consisting of commutative arithmetic operations.
Suppose there is a path in that tree such that each node on it has at least one constant operand.
Given that, it is very likely that the combiner will simplify this sub-tree by applying arithmetic rules and constant folding.
This heuristic gives such constant folding priority over keeping the outer node uniform.

```
  %and = and i32 %tmp, 16711935     ; 0x00ff00ff
  %tmp1 = and i32 %arg1, 4294967040 ; 0xffffff00
  %tmp2 = or i32 %tmp1, -65536
  %tmp3 = or i32 %tmp2, %and
```
With this heuristic, the sequence above is folded and can be selected to v_perm_b32; without it, it stays as 4 scalar operations.
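
To make the folding concrete, here is a small Python sketch (not part of the patch) that models the four IR operations on 32-bit values and compares them with the byte-wise form I would expect after constant folding. The function names `original` and `folded` are hypothetical, and the folded form is my reading of the arithmetic, not output copied from the combiner.

```python
import random

MASK32 = 0xFFFFFFFF  # model i32 arithmetic on Python's unbounded ints


def original(tmp, arg1):
    """Mirror the four IR instructions one by one."""
    and_ = tmp & 0x00FF00FF                # %and  = and i32 %tmp, 16711935
    tmp1 = arg1 & 0xFFFFFF00               # %tmp1 = and i32 %arg1, 4294967040
    tmp2 = tmp1 | (-65536 & MASK32)        # %tmp2 = or i32 %tmp1, -65536
    return (tmp2 | and_) & MASK32          # %tmp3 = or i32 %tmp2, %and


def folded(tmp, arg1):
    """Assumed folded form: each result byte comes from one source.

    bytes 3..2: constant 0xff
    byte  1   : byte 1 of arg1
    byte  0   : byte 0 of tmp
    """
    return 0xFFFF0000 | (arg1 & 0x0000FF00) | (tmp & 0x000000FF)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(100_000):
        t, a = rng.getrandbits(32), rng.getrandbits(32)
        assert original(t, a) == folded(t, a)
    print("original and folded forms agree on all sampled inputs")
```

Since every result byte is either a constant 0xff or a single byte taken from one of the two inputs, the whole expression is a byte permute, which is the shape v_perm_b32 is meant for.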

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116270/new/

https://reviews.llvm.org/D116270