[PATCH] D13186: AMDGPU: Make SIInsertWaits about a factor of 4 faster

Fri Sep 25 17:52:27 PDT 2015

arsenm created this revision.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.
Herald added a subscriber: arsenm.

This was the slowest custom target pass. It was spending 80%
of its time in getMinimalPhysRegClass, which was called
for every register operand.

When possible, use the statically known register class from
the instruction's MCOperandInfo. A few pseudo instructions are not
well behaved and have unknown register classes, so they still
require the expensive physical register class search.
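
The fast path / slow path split described above can be sketched roughly
as follows. This is a self-contained model, not code from the patch:
RegClass, OperandDesc, RegInfo, findMinimalClass and getOperandRegClass
are hypothetical stand-ins for the real LLVM types, and a class ID of -1
is assumed to mean "no statically known register class".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for LLVM's register-class machinery. None of these
// names or layouts are taken from the patch itself.
struct RegClass {
  int ID;                     // assumed to equal the index in RegInfo::Classes
  std::vector<unsigned> Regs; // physical registers contained in this class
};

struct OperandDesc {
  int StaticRegClassID; // assumption: -1 when the class is not known statically
};

struct RegInfo {
  std::vector<RegClass> Classes;       // assumed ordered smallest class first
  mutable std::size_t SlowLookups = 0; // counts how often the slow path runs

  // Slow path: scan every class until one containing Reg is found.
  const RegClass *findMinimalClass(unsigned Reg) const {
    ++SlowLookups;
    for (const RegClass &RC : Classes)
      for (unsigned R : RC.Regs)
        if (R == Reg)
          return &RC;
    return nullptr;
  }

  // Fast path helper: direct index by class ID.
  const RegClass *getClassByID(int ID) const {
    assert(ID >= 0 && static_cast<std::size_t>(ID) < Classes.size());
    return &Classes[static_cast<std::size_t>(ID)];
  }
};

// Prefer the class recorded in the operand description; fall back to the
// expensive search only when the description does not provide one.
const RegClass *getOperandRegClass(const RegInfo &RI, const OperandDesc &Desc,
                                   unsigned Reg) {
  if (Desc.StaticRegClassID >= 0)
    return RI.getClassByID(Desc.StaticRegClassID);
  return RI.findMinimalClass(Reg);
}
```

In this model, an operand with a known class never touches the linear
search, so the cost of the search is paid only by the few operands
(such as those of poorly behaved pseudo instructions) that lack one.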

There are a few other possibilities for making this even faster,
such as not inspecting implicit operands. For now, implicit operands
are still checked because it is technically possible to have a scalar
load into exec or vcc, which can then be used implicitly.

http://reviews.llvm.org/D13186

Files:
  lib/Target/AMDGPU/SIInsertWaits.cpp
  lib/Target/AMDGPU/SIRegisterInfo.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13186.35787.patch
Type: text/x-patch
Size: 4207 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150926/84f55ec9/attachment.bin>