[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 20 09:25:02 PST 2022
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12491-12493
+ if (isCommutativeBinOp(N1->getOpcode()) &&
+ DAG.isConstantIntBuildVectorOrConstantInt(N1->getOperand(1)))
+ return true;
----------------
alex-t wrote:
> foad wrote:
> > I don't understand this heuristic. Can you give an example of when it would help?
> I could demonstrate a concrete example, but I would need to paste the DAGs here, which looks like overkill. So I'll try to explain without the drawing.
> Let's imagine we have a sub-tree consisting of commutative arithmetic operations.
> Let us have a path in the tree such that each node has at least one constant operand.
> Given that, it is very likely that this sub-tree is going to be simplified by the combiner by applying arithmetic rules and constant folding.
> This heuristic gives priority to such constant folding over keeping the outer node uniform.
>
> ```
> %and = and i32 %tmp, 16711935 ; 0x00ff00ff
> %tmp1 = and i32 %arg1, 4294967040 ; 0xffffff00
> %tmp2 = or i32 %tmp1, -65536
> %tmp3 = or i32 %tmp2, %and
>
> ```
> This is folded and can be selected to v_perm_b32 with this heuristic but will be 4 scalar operations w/o it.
I still don't see why this would be useful //in general//. I think it means we should do this reassociation:
`(op (op n00, C), (op2 n10, C2)) --> (op (op n00, (op2 n10, C2)), C)`
where op2 is commutative but not necessarily the same as op. E.g. `(x|C)|(z&C2) --> (x|(z&C2))|C`
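
As a sanity check of that reassociation for the or/and case, here is a minimal sketch using the constants from the quoted IR. The function and constant names (`origExpr`, `reassocExpr`, `foldedExpr`, `kC`, `kC2`, `kMask`) are hypothetical and exist only for illustration. `foldedExpr` is a hand simplification of the same expression, not necessarily the exact form the DAG combiner produces.

```cpp
#include <cstdint>

// Constants taken from the quoted IR example.
constexpr std::uint32_t kC    = 0xffff0000u; // -65536 as an i32
constexpr std::uint32_t kC2   = 0x00ff00ffu; // 16711935
constexpr std::uint32_t kMask = 0xffffff00u; // 4294967040

// Original shape: (x | C) | (z & C2), with x = arg1 & kMask and z = tmp.
constexpr std::uint32_t origExpr(std::uint32_t arg1, std::uint32_t tmp) {
  return ((arg1 & kMask) | kC) | (tmp & kC2);
}

// Reassociated shape: (x | (z & C2)) | C, with the constant moved outermost.
constexpr std::uint32_t reassocExpr(std::uint32_t arg1, std::uint32_t tmp) {
  return ((arg1 & kMask) | (tmp & kC2)) | kC;
}

// Hand-folded form (an assumption about where simplification can end up):
// the upper 16 bits are all ones, byte 1 comes from arg1, byte 0 from tmp.
constexpr std::uint32_t foldedExpr(std::uint32_t arg1, std::uint32_t tmp) {
  return 0xffff0000u | (arg1 & 0x0000ff00u) | (tmp & 0x000000ffu);
}

// Compile-time spot check on one input pair; not a proof of equivalence.
static_assert(origExpr(0x12345678u, 0x9abcdef0u) ==
                  reassocExpr(0x12345678u, 0x9abcdef0u),
              "reassociation must preserve the value");
static_assert(origExpr(0x12345678u, 0x9abcdef0u) ==
                  foldedExpr(0x12345678u, 0x9abcdef0u),
              "hand-folded form must match the original");
```

With the constant moved to the outermost `or`, the masks `kMask` and `kC2` sit next to each other, and the result is a per-byte selection between `arg1`, `tmp`, and all-ones bytes.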
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116270/new/
https://reviews.llvm.org/D116270