[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection

Mon Jan 10 08:04:36 PST 2022

foad added a comment.

In D116270#3231657 <https://reviews.llvm.org/D116270#3231657>, @alex-t wrote:

> Once again, in my case BOTH nodes (not,xor) are divergent!
>
>                     %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
>   DIVERGENT:       %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
>   DIVERGENT:       %xor = xor i32 %v, %s.load
>   DIVERGENT:       %d = xor i32 %xor, -1
>   DIVERGENT:       store i32 %d, i32 addrspace(1)* %out.load, align 4

I know. I am suggesting that a DAG combine can use the identity ~(v ^ s) == v ^ ~s to rewrite this code to the equivalent of:

                   %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
  DIVERGENT:       %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
                   %not = xor i32 %s.load, -1
  DIVERGENT:       %d = xor i32 %v, %not
  DIVERGENT:       store i32 %d, i32 addrspace(1)* %out.load, align 4

Now %not is uniform, because it depends only on the uniform %s.load, so it is trivial to select it to s_not.
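
As a quick sanity check of the rewrite, here is a minimal sketch that models both shapes on 32-bit values and compares them. It is not LLVM code: the names xnor_original, xnor_rewritten, forms_agree and SAMPLES are hypothetical, and it assumes inputs already fit in an unsigned 32-bit integer.

```python
MASK32 = 0xFFFFFFFF  # model i32 values as unsigned 32-bit integers


def xnor_original(v: int, s: int) -> int:
    """not(xor(v, s)): the quoted shape, where both nodes depend on v."""
    xor = (v ^ s) & MASK32
    return xor ^ MASK32  # "xor i32 %xor, -1"


def xnor_rewritten(v: int, s: int) -> int:
    """xor(v, not(s)): the shape after the suggested combine."""
    not_s = (s ^ MASK32) & MASK32  # depends only on s, never on v
    return (v ^ not_s) & MASK32


def forms_agree(samples) -> bool:
    """Return True if both shapes give the same result on every (v, s) pair."""
    return all(xnor_original(v, s) == xnor_rewritten(v, s) for v, s in samples)


# A few illustrative operand pairs; v stands in for the workitem id,
# s for the value loaded from the kernel argument.
SAMPLES = [
    (v, s)
    for v in (0, 1, 63, 0x12345678, MASK32)
    for s in (0, 1, 0xDEADBEEF, MASK32)
]

if __name__ == "__main__":
    print(forms_agree(SAMPLES))
```

The point of the second form is that the NOT is computed from s alone, so in the IR above it no longer inherits divergence from %v.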

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D116270