[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 11 01:36:01 PDT 2020
Flakebi updated this revision to Diff 249563.
Flakebi added a comment.
Address Nicolai’s comments and implement this as DAG combines and TableGen patterns.
The code generation for test2 is currently not optimal:
  %trunc = trunc i32 %x to i1
  %ballot = call i64 @llvm.amdgcn.ballot.i64(i1 %trunc)
generates
  v_and_b32_e32 v0, 1, v0
  v_cmp_eq_u32_e32 vcc, 1, v0
  s_and_b64 s[4:5], vcc, exec
where the v_and_b32/v_cmp_eq_u32 pair stems from the truncate.
We could handle the (ballot (truncate x)) case in the DAG combine. Any opinions on this?
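For illustration only, here is a small Python model of the semantics in question. It is not part of the patch. It assumes a wave64 target and models a ballot as "bit i is set iff lane i is active and its predicate is true". The function names and the "folded" form are hypothetical, meant only to show that testing the low bit directly gives the same mask as truncating to i1 first.

```python
WAVE_SIZE = 64  # assumption: wave64, matching the i64 ballot in the example


def trunc_to_i1(x: int) -> bool:
    """Model of `trunc i32 %x to i1`: only the lowest bit survives."""
    return (x & 1) == 1


def ballot(preds, exec_mask: int) -> int:
    """Model of a ballot: bit i is set iff lane i is active and preds[i] is true."""
    mask = 0
    for lane, pred in enumerate(preds):
        if pred and (exec_mask >> lane) & 1:
            mask |= 1 << lane
    return mask


def ballot_of_trunc(values, exec_mask: int) -> int:
    """Unfolded form: truncate each lane's value to i1, then ballot."""
    return ballot([trunc_to_i1(v) for v in values], exec_mask)


def ballot_of_low_bit(values, exec_mask: int) -> int:
    """Hypothetical folded form: test the low bit directly, then ballot."""
    return ballot([(v & 1) != 0 for v in values], exec_mask)


# Each lane holds its own lane index; only the lower 32 lanes are active.
values = list(range(WAVE_SIZE))
exec_mask = 0x00000000FFFFFFFF

# Both forms agree: the odd-numbered active lanes vote true.
assert ballot_of_trunc(values, exec_mask) == ballot_of_low_bit(values, exec_mask)
print(hex(ballot_of_trunc(values, exec_mask)))  # prints 0xaaaaaaaa
```

Whether a combine on (ballot (truncate x)) actually saves an instruction depends on how the resulting compare is selected. The model above only checks that the two forms are equivalent.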
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65088/new/
https://reviews.llvm.org/D65088
Files:
llvm/include/llvm/IR/IntrinsicsAMDGPU.td
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll
llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
llvm/test/CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll
llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll
llvm/test/CodeGen/AMDGPU/atomic_optimizations_struct_buffer.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ballot.i32.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ballot.i64.ll
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D65088.249563.patch
Type: text/x-patch
Size: 130318 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200311/c0c82e50/attachment-0001.bin>