[PATCH] D11883: AMDGPU/SI: Better handle s_wait insertion

Axel Davy via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 9 05:54:33 PDT 2015


axeldavy created this revision.
axeldavy added a subscriber: llvm-commits.

We can wait on either VM, EXP or LGKM.
The waits are independent.

Without this patch, a wait inserted because of one of them
would also wait for all the earlier operations on the others.
This patch makes s_waitcnt wait only for the counters the next instruction needs.

Here's an example of the subtle performance loss this patch fixes:

This is without the patch:

buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen
buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen
s_load_dwordx4 s[44:47], s[8:9], 0xc
s_waitcnt lgkmcnt(0)
buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen
s_load_dwordx4 s[48:51], s[8:9], 0x10
s_waitcnt vmcnt(1)
buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen

The s_waitcnt vmcnt(1) is useless.
It is added because the last
buffer_load_format_xyzw needs s[44:47], which was loaded
by the first s_load_dwordx4. The wait forces every VM operation
issued before that s_load_dwordx4 to finish.

Internally, 3 counters (for VM, EXP and LGKM)
are updated after every instruction. For example buffer_load_format_xyzw will
increase the VM counter, and s_load_dwordx4 the LGKM one.
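
As a rough illustration of that bookkeeping, here is a minimal Python sketch that counts how many operations of each kind the listing above issues. It is not code from SIInsertWaits.cpp: the table COUNTER_FOR_OPCODE and the function issued_counts are hypothetical names, and only the two opcodes mentioned in this mail are mapped.

```python
# Hypothetical mapping from opcode to the counter it increases.
# Only the two opcodes discussed in this mail are covered (assumption).
COUNTER_FOR_OPCODE = {
    "buffer_load_format_xyzw": "VM",
    "s_load_dwordx4": "LGKM",
}

# Opcodes of the listing above, in program order.
LISTING = [
    "buffer_load_format_xyzw",
    "buffer_load_format_xyzw",
    "s_load_dwordx4",
    "s_waitcnt",
    "buffer_load_format_xyzw",
    "s_load_dwordx4",
    "s_waitcnt",
    "buffer_load_format_xyzw",
]


def issued_counts(opcodes):
    """Count issued operations per counter; unmapped opcodes bump nothing."""
    counts = {"VM": 0, "EXP": 0, "LGKM": 0}
    for opcode in opcodes:
        counter = COUNTER_FOR_OPCODE.get(opcode)
        if counter is not None:
            counts[counter] += 1
    return counts
```

These are counts of issued operations as the pass would track them, not the number of operations still in flight on the hardware.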

Without the patch, for every defined register,
the current 3 counters are stored, and are used to know
how long to wait when an instruction needs the register.

Because of that, the counters stored for s[44:47] say that using the register
also requires waiting for the previous buffer_load_format_xyzw instructions.

Instead this patch stores only the counters that matter for the register,
and puts zero for the other ones, since we don't need any wait for them.
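
The difference between the two schemes can be sketched with a small Python model. This is a simplified illustration under stated assumptions, not the code in SIInsertWaits.cpp: WaitTracker, issue, note_wait, waits_before and replay_listing are hypothetical names, register ranges are treated as opaque strings, and the s_waitcnt lgkmcnt(0) from the listing is taken as given because the instruction that defines s[52:55] is not shown.

```python
COUNTERS = ("vm", "exp", "lgkm")


class WaitTracker:
    """Toy model of the per-register counter snapshots (hypothetical)."""

    def __init__(self, per_counter):
        self.per_counter = per_counter  # True models the patched behaviour
        self.last_issued = dict.fromkeys(COUNTERS, 0)
        self.waited_on = dict.fromkeys(COUNTERS, 0)
        self.defined = {}  # register -> counter snapshot taken at its definition

    def issue(self, counter, defs):
        """Issue an instruction that bumps `counter` and defines `defs`."""
        self.last_issued[counter] += 1
        if self.per_counter:
            # Patched: keep only the counter that matters, zero for the others.
            snapshot = dict.fromkeys(COUNTERS, 0)
            snapshot[counter] = self.last_issued[counter]
        else:
            # Unpatched: store all three current counters.
            snapshot = dict(self.last_issued)
        for reg in defs:
            self.defined[reg] = snapshot

    def note_wait(self, counter, outstanding):
        """Record an s_waitcnt leaving at most `outstanding` operations pending."""
        done = self.last_issued[counter] - outstanding
        self.waited_on[counter] = max(self.waited_on[counter], done)

    def waits_before(self, uses):
        """Return {counter: waitcnt value} needed before reading `uses`."""
        waits = {}
        for c in COUNTERS:
            needed = max(
                (self.defined[r][c] for r in uses if r in self.defined),
                default=0,
            )
            if needed > self.waited_on[c]:
                waits[c] = self.last_issued[c] - needed
                self.waited_on[c] = needed
        return waits


def replay_listing(per_counter):
    """Replay the listing and return the waits before its last instruction."""
    t = WaitTracker(per_counter)
    t.issue("vm", ["v[8:11]"])
    t.issue("vm", ["v[12:15]"])
    t.issue("lgkm", ["s[44:47]"])
    t.note_wait("lgkm", 0)  # the s_waitcnt lgkmcnt(0) from the listing
    t.issue("vm", ["v[16:19]"])
    t.issue("lgkm", ["s[48:51]"])
    return t.waits_before(["v0", "s[44:47]"])
```

In this model, replay_listing(False) returns {"vm": 1}, which corresponds to the useless s_waitcnt vmcnt(1): three VM loads have been issued and the snapshot for s[44:47] recorded two of them, so 3 - 2 = 1. replay_listing(True) returns an empty dict, since the LGKM load of s[44:47] has already been waited on.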

http://reviews.llvm.org/D11883

Files:
  lib/Target/AMDGPU/SIInsertWaits.cpp
  test/CodeGen/AMDGPU/setcc-opt.ll
  test/CodeGen/AMDGPU/wait.ll
  test/CodeGen/AMDGPU/wait2.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D11883.31615.patch
Type: text/x-patch
Size: 4997 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150809/83cc0739/attachment.bin>