[PATCH] D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic

Wed Mar 11 01:36:01 PDT 2020

Flakebi updated this revision to Diff 249563.
Flakebi added a comment.

Address Nicolai’s comments and implement this as DAG combines and TableGen patterns.

The code generation for test2 is currently not optimal:

  %trunc = trunc i32 %x to i1
  %ballot = call i64 @llvm.amdgcn.ballot.i64(i1 %trunc)

generates

  v_and_b32_e32 v0, 1, v0
  v_cmp_eq_u32_e32 vcc, 1, v0
  s_and_b64 s[4:5], vcc, exec

where the first compare stems from the truncate.
We could handle the case of (ballot (truncate x)) in the combining. Any opinions on this?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65088/new/

https://reviews.llvm.org/D65088

Files:
  llvm/include/llvm/IR/IntrinsicsAMDGPU.td
  llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.h
  llvm/lib/Target/AMDGPU/SIInstructions.td
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_struct_buffer.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ballot.i32.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ballot.i64.ll
  llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D65088.249563.patch
Type: text/x-patch
Size: 130318 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200311/c0c82e50/attachment-0001.bin>