[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection

Mon Jan 10 08:09:24 PST 2022

alex-t added a comment.

In D116270#3231666 <https://reviews.llvm.org/D116270#3231666>, @foad wrote:

> In D116270#3231657 <https://reviews.llvm.org/D116270#3231657>, @alex-t wrote:
>
>> Once again, in my case BOTH nodes (not,xor) are divergent!
>>
>>                    %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
>>   DIVERGENT:       %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
>>   DIVERGENT:       %xor = xor i32 %v, %s.load
>>   DIVERGENT:       %d = xor i32 %xor, -1
>>   DIVERGENT:       store i32 %d, i32 addrspace(1)* %out.load, align 4
>
> I know. I am suggesting that a DAG combine can rewrite this code to the equivalent of:
>
>                    %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
>   DIVERGENT:       %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
>                    %not = xor i32 %s.load, -1
>   DIVERGENT:       %d = xor i32 %v, %not
>   DIVERGENT:       store i32 %d, i32 addrspace(1)* %out.load, align 4
>
> Now %not is uniform, so it is trivial to select it to s_not.

Okay. I have finally got the idea.
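
For reference, the suggested rewrite relies on the identity
not(xor(v, s)) == xor(v, not(s)), which moves the NOT onto the uniform
operand. Below is a minimal C++ sketch of that identity on 32-bit values.
The function names are hypothetical and only illustrate the bitwise
equivalence; this is not the DAG combine itself and not code from the patch.

```cpp
#include <cstdint>

// Original form: the inner xor mixes the divergent value v with the uniform
// value s, so the trailing "xor ..., -1" (the NOT) is divergent as well.
constexpr uint32_t xnor_original(uint32_t v, uint32_t s) {
    return (v ^ s) ^ 0xFFFFFFFFu;
}

// Rewritten form: the NOT is applied to the uniform value s alone, so only
// the final xor depends on the divergent value v.
constexpr uint32_t xnor_rewritten(uint32_t v, uint32_t s) {
    const uint32_t not_s = s ^ 0xFFFFFFFFu;  // depends only on s
    return v ^ not_s;
}

// Compile-time spot checks that both forms agree on a few inputs.
static_assert(xnor_original(0u, 0u) == xnor_rewritten(0u, 0u), "forms differ");
static_assert(xnor_original(5u, 3u) == xnor_rewritten(5u, 3u), "forms differ");
static_assert(xnor_original(0xDEADBEEFu, 0x12345678u) ==
              xnor_rewritten(0xDEADBEEFu, 0x12345678u), "forms differ");
```

Because xor is associative and commutative, the two forms agree for every
input, which is why the NOT can be hoisted onto the uniform operand.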

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116270/new/

https://reviews.llvm.org/D116270