[PATCH] D31161: [AMDGPU] New Waitcnt Insertion Pass
Kannan Narayanan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 23 17:28:06 PDT 2017
kanarayan updated this revision to Diff 92883.
kanarayan added a comment.
This addresses two issues:
1. SQ_MAX_PGM_VGPRS and the other constants are used to map LLVM registers onto a data structure internal to this algorithm; see the comments before the enum for the register mapping the algorithm uses. They are maximum values across all targets. Ideally the pass would use dynamically sized arrays that fit the particular target architecture. As an interim step, I have changed these constants to enum values and added an assert in the main entry point of this pass that no target has a larger register file. (In response to comments from Tony and Konstantin.) I have also updated the getRegInterval call. Note that the previous version was identical to the one used by the old pass, except that I make the necessary adjustments for the mapping used by this algorithm. As a next step, I will remove the assert and the associated code.
2. Barriers no longer force a zero waitcnt. For GFX9 and above, a barrier needs no additional waitcnt. For earlier targets, waitcnts are added only if needed. The following tests already exercise barriers:
   LLVM :: CodeGen/AMDGPU/addrspacecast.ll
   LLVM :: CodeGen/AMDGPU/array-ptr-calc-i32.ll
   LLVM :: CodeGen/AMDGPU/ds-negative-offset-addressing-mode-loop.ll
   LLVM :: CodeGen/AMDGPU/ds_read2.ll
   LLVM :: CodeGen/AMDGPU/indirect-private-64.ll
   LLVM :: CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
   LLVM :: CodeGen/AMDGPU/local-memory.amdgcn.ll
   LLVM :: CodeGen/AMDGPU/merge-stores.ll
   LLVM :: CodeGen/AMDGPU/schedule-vs-if-nested-loop-failure.ll
   LLVM :: CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
   LLVM :: CodeGen/AMDGPU/store-barrier.ll
   LLVM :: CodeGen/AMDGPU/wait.ll
There is also a CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll test.
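To make the barrier change in item 2 concrete, here is a minimal standalone sketch of the decision, not the code in the patch. The Generation enum, the PendingCounts struct, and barrierNeedsWaitcnt are hypothetical names, and "needed" is assumed to mean "some memory operation is still outstanding"; the patch may use a narrower condition.

```cpp
#include <cassert>

// Hypothetical generation ordering, oldest to newest (illustration only).
enum class Generation { SI, CI, VI, GFX9 };

// Hypothetical snapshot of outstanding operations per hardware counter.
struct PendingCounts {
  unsigned vm;    // outstanding vector memory operations
  unsigned lgkm;  // outstanding LDS/GDS/constant/message operations
  unsigned exp;   // outstanding exports
};

// Returns true if a waitcnt should be emitted before a barrier.
// Assumption: GFX9 and above never need one; older targets need one
// only when something is actually outstanding, instead of always
// forcing a wait to zero.
bool barrierNeedsWaitcnt(Generation gen, const PendingCounts &p) {
  if (gen >= Generation::GFX9)
    return false;
  return p.vm != 0 || p.lgkm != 0 || p.exp != 0;
}
```

With this shape, a barrier on a pre-GFX9 target with nothing pending emits no waitcnt at all, which is the difference from unconditionally forcing a zero waitcnt.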
This patch does not yet address the XNACK changes in https://reviews.llvm.org/D30302
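For item 1, the interim approach of enum-valued bounds plus an assert can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the numeric values, SQ_MAX_PGM_SGPRS, the layout (VGPR slots first, SGPR slots after), and the helpers fitsInTables and mapRegInterval are all hypothetical.

```cpp
#include <cassert>

// Assumed upper bounds across all targets; the values in the patch may differ.
enum RegisterMapping {
  SQ_MAX_PGM_VGPRS = 256,  // assumed maximum VGPR count
  SQ_MAX_PGM_SGPRS = 256,  // assumed maximum SGPR count (hypothetical name)
  NUM_ALL_SLOTS = SQ_MAX_PGM_VGPRS + SQ_MAX_PGM_SGPRS
};

enum class RegKind { VGPR, SGPR };

// Half-open interval [first, second) of slots in the internal table.
struct RegInterval {
  int first;
  int second;
};

// The check the pass entry point would assert: the target's register
// file must fit in the fixed-size tables.
bool fitsInTables(int numVGPRs, int numSGPRs) {
  return numVGPRs <= SQ_MAX_PGM_VGPRS && numSGPRs <= SQ_MAX_PGM_SGPRS;
}

// Maps a hardware register index and width (in dwords) to table slots.
// Assumed layout: VGPRs occupy the first SQ_MAX_PGM_VGPRS slots and
// SGPRs follow them.
RegInterval mapRegInterval(RegKind kind, int hwIndex, int widthInDwords) {
  const bool isVGPR = (kind == RegKind::VGPR);
  const int base = isVGPR ? 0 : SQ_MAX_PGM_VGPRS;
  const int limit = isVGPR ? SQ_MAX_PGM_VGPRS : SQ_MAX_PGM_SGPRS;
  assert(hwIndex >= 0 && widthInDwords > 0);
  assert(hwIndex + widthInDwords <= limit);
  return RegInterval{base + hwIndex, base + hwIndex + widthInDwords};
}
```

Sizing the tables dynamically per target would make the fitsInTables check unnecessary, which matches the stated plan to remove the assert later.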
https://reviews.llvm.org/D31161
Files:
lib/Target/AMDGPU/AMDGPU.h
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
lib/Target/AMDGPU/CMakeLists.txt
lib/Target/AMDGPU/SIInsertWaitcnts.cpp
test/CodeGen/AMDGPU/basic-branch.ll
test/CodeGen/AMDGPU/branch-relaxation.ll
test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
test/CodeGen/AMDGPU/indirect-addressing-si.ll
test/CodeGen/AMDGPU/infinite-loop.ll
test/CodeGen/AMDGPU/llvm.amdgcn.buffer.store.format.ll
test/CodeGen/AMDGPU/llvm.amdgcn.buffer.store.ll
test/CodeGen/AMDGPU/llvm.amdgcn.image.ll
test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.inv.ll
test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.inv.vol.ll
test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.wb.ll
test/CodeGen/AMDGPU/llvm.amdgcn.s.dcache.wb.vol.ll
test/CodeGen/AMDGPU/llvm.amdgcn.s.waitcnt.ll
test/CodeGen/AMDGPU/si-lower-control-flow-unreachable-block.ll
test/CodeGen/AMDGPU/smrd-vccz-bug.ll
test/CodeGen/AMDGPU/spill-m0.ll
test/CodeGen/AMDGPU/valu-i1.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D31161.92883.patch
Type: text/x-patch
Size: 87149 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170324/5424bc6f/attachment.bin>