[PATCH] D22092: AMDGPU: Reduce the duration of whole-quad-mode

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 7 07:34:56 PDT 2016


nhaehnle created this revision.
nhaehnle added reviewers: arsenm, tstellarAMD, mareko.
nhaehnle added a subscriber: llvm-commits.
Herald added subscribers: kzhuravl, arsenm.

This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:

1. Sampling instructions by themselves don't need to run in WQM, only their
   coordinate inputs need it (unless of course there is a dependent sampling
   instruction). The initial scanInstructions step is modified accordingly.

2. When switching back from WQM to Exact, switch back as soon as possible.
   This affects the logic in processBlock.

This should always be a win or at best neutral.

There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.

http://reviews.llvm.org/D22092

Files:
  lib/Target/AMDGPU/SIWholeQuadMode.cpp
  test/CodeGen/AMDGPU/wqm.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D22092.63071.patch
Type: text/x-patch
Size: 16770 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160707/dbc0bfc7/attachment.bin>


More information about the llvm-commits mailing list