[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 10 08:04:36 PST 2022
foad added a comment.
In D116270#3231657 <https://reviews.llvm.org/D116270#3231657>, @alex-t wrote:
> Once again, in my case BOTH nodes (not,xor) are divergent!
>
> %s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
> DIVERGENT: %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
> DIVERGENT: %xor = xor i32 %v, %s.load
> DIVERGENT: %d = xor i32 %xor, -1
> DIVERGENT: store i32 %d, i32 addrspace(1)* %out.load, align 4
I know. I am suggesting that a DAG combine can rewrite this code to the equivalent of:
%s.load = load i32, i32 addrspace(4)* %s.kernarg.offset.cast, align 4, !invariant.load !0
DIVERGENT: %v = call i32 @llvm.amdgcn.workitem.id.x(), !range !1
%not = xor i32 %s.load, -1
DIVERGENT: %d = xor i32 %v, %not
DIVERGENT: store i32 %d, i32 addrspace(1)* %out.load, align 4
Now %not is uniform, so it is trivial to select it to s_not.
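
The rewrite relies on the bitwise identity ~(v ^ s) == v ^ ~s, which lets the not be moved off the divergent xor and onto the uniform operand. A minimal sketch that checks this identity in plain C++ follows. It is not LLVM code and not the DAG combine itself, and the function names notOfXor, xorWithNot and checkSmallInputs are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Original form: the not is applied to the result of the xor, so it
// depends on the divergent value v. (Hypothetical helper name.)
constexpr uint32_t notOfXor(uint32_t v, uint32_t s) { return ~(v ^ s); }

// Rewritten form: the not is applied only to s, the uniform operand,
// and the single remaining xor combines it with v. (Hypothetical helper name.)
constexpr uint32_t xorWithNot(uint32_t v, uint32_t s) { return v ^ ~s; }

// Compile-time spot checks of the identity on a couple of 32-bit values.
static_assert(notOfXor(0u, 0u) == xorWithNot(0u, 0u),
              "forms must agree");
static_assert(notOfXor(0x12345678u, 0xdeadbeefu) ==
                  xorWithNot(0x12345678u, 0xdeadbeefu),
              "forms must agree");

// Run-time check over every pair of 8-bit inputs. The identity holds
// independently for each bit, so this is only a sanity check, not a proof.
inline void checkSmallInputs() {
  for (uint32_t v = 0; v < 256; ++v)
    for (uint32_t s = 0; s < 256; ++s)
      assert(notOfXor(v, s) == xorWithNot(v, s));
}
```

In the IR above, v plays the role of %v (the workitem id) and s the role of %s.load, so ~s corresponds to the uniform %not.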
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116270/new/
https://reviews.llvm.org/D116270