[PATCH] D31161: [AMDGPU] New Waitcnt Insertion Pass

Mon Mar 20 16:49:21 PDT 2017

kanarayan created this revision.
Herald added subscribers: tpr, dstuttard, tony-tye, yaxunl, mgorny, nhaehnle, wdng, kzhuravl, arsenm, qcolombet, MatzeB.

This pass implements the algorithm deployed by our internal compiler for inserting waitcnt instructions. The pass performs cross basic-block analysis and tracks individual registers, and provides predicted performance improvements over the current implementation.

There are further improvements forthcoming, including relaxing overtly conservative assumptions about LDS access, integration of memory model pass, and more targeted tests for the corners.

https://reviews.llvm.org/D31161

Files:
  lib/Target/AMDGPU/AMDGPU.h
  lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  lib/Target/AMDGPU/CMakeLists.txt
  lib/Target/AMDGPU/SIInsertWaitcnts.cpp
  test/CodeGen/AMDGPU/basic-branch.ll
  test/CodeGen/AMDGPU/branch-relaxation.ll
  test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
  test/CodeGen/AMDGPU/indirect-addressing-si.ll
  test/CodeGen/AMDGPU/infinite-loop.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.buffer.store.format.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.buffer.store.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.image.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.inv.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.inv.vol.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.wb.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.wb.vol.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.s.waitcnt.ll
  test/CodeGen/AMDGPU/si-lower-control-flow-unreachable-block.ll
  test/CodeGen/AMDGPU/smrd-vccz-bug.ll
  test/CodeGen/AMDGPU/spill-m0.ll
  test/CodeGen/AMDGPU/valu-i1.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D31161.92404.patch
Type: text/x-patch
Size: 86901 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170320/2d14d5bb/attachment.bin>