[PATCH] D72620: AMDGPU/GlobalISel: Add documentation for RegisterBankInfo

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 17 16:08:44 PST 2020


arsenm updated this revision to Diff 238918.
arsenm marked 2 inline comments as done.
arsenm added a comment.

Fix typos


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72620/new/

https://reviews.llvm.org/D72620

Files:
  llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
@@ -8,7 +8,64 @@
 /// \file
 /// This file implements the targeting of the RegisterBankInfo class for
 /// AMDGPU.
-/// \todo This should be generated by TableGen.
+///
+/// \par
+///
+/// AMDGPU has unique register bank constraints that require special high-level
+/// strategies to deal with. There are two main true physical register banks:
+/// VGPR (vector) and SGPR (scalar). Additionally, the VCC register bank is a
+/// sort of pseudo-register bank needed to represent SGPRs used in a vector
+/// boolean context. There is also the AGPR bank, which is a special-purpose
+/// physical register bank present on some subtargets.
+///
+/// Copying from VGPR to SGPR is generally illegal, unless the value is known to
+/// be uniform. It is generally not valid to legalize operands by inserting
+/// copies as on other targets. Operations which require uniform, SGPR operands
+/// generally require scalarization by repeatedly executing the instruction,
+/// activating each set of lanes using a unique set of input values. This is
+/// referred to as a waterfall loop.
+///
+/// \par Booleans
+///
+/// Booleans (s1 values) require special consideration. A vector compare result
+/// is naturally a bitmask with one bit per lane, in a 32-bit or 64-bit
+/// register. These are represented with the VCC bank. During selection, we need
+/// to be able to unambiguously go back from a register class to a register
+/// bank. To distinguish whether an SGPR should use the SGPR or VCC register
+/// bank, we need to know the use context type. An SGPR s1 value always means a
+/// VCC bank value; any other SGPR value uses the SGPR bank. A scalar compare
+/// sets SCC, which is a 1-bit unaddressable register. This will need to be
+/// copied to a 32-bit virtual register. Taken together, this means we need to
+/// adjust the type of boolean operations to be regbank legal. All SALU booleans
+/// need to be widened to 32 bits, and all VALU booleans need to be s1 values.
+///
+/// A noteworthy exception to the s1-means-vcc rule is for legalization artifact
+/// casts. G_TRUNC s1 results, and G_SEXT/G_ZEXT/G_ANYEXT sources are never vcc
+/// bank. A non-boolean source (such as a truncate from a 1-bit load from
+/// memory) will require a copy to the VCC bank, which will require clearing the
+/// high bits and inserting a compare.
+///
+/// \par Constant bus restriction
+///
+/// VALU instructions have a limitation known as the constant bus
+/// restriction. Most VALU instructions can use SGPR operands, but may read at
+/// most 1 SGPR or constant literal value (this is raised to 2 in gfx10 for most
+/// instructions). This is one unique SGPR, so the same SGPR may be used for
+/// multiple operands. From a register bank perspective, any combination of
+/// operands should be legal as an SGPR, but this is contextually dependent on
+/// the SGPR operands all being the same register. It is therefore optimal to
+/// choose the SGPR with the most uses to minimize the number of copies.
+///
+/// We avoid trying to solve this problem in RegBankSelect. Any VALU G_*
+/// operation should have its source operands all mapped to VGPRs (except for
+/// VCC), inserting copies from any SGPR operands. This is the most trivial
+/// legal mapping. Anything beyond the simplest 1:1 instruction selection would
+/// be too complicated to solve here. Every optimization pattern or instruction
+/// selected to multiple outputs would have to enforce this rule, and there
+/// would be additional complexity in tracking this rule for every G_*
+/// operation. Forcing all inputs to VGPRs also simplifies the task of
+/// picking the optimal operand combination from a post-isel optimization pass.
+///
 //===----------------------------------------------------------------------===//
 
 #include "AMDGPURegisterBankInfo.h"

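For readers unfamiliar with the waterfall loop mentioned in the new comment,
here is a minimal host-side sketch of the idea. It is an illustration only and
not LLVM code: the function name waterfallIterations and the model of a wave as
a plain vector of per-lane values are assumptions made for this example. Each
iteration takes the value held by the first lane still pending, treats it as
the uniform (SGPR) operand, and retires every pending lane holding that value.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a waterfall loop. LaneValues[I] is the value lane I
// holds for an operand that the instruction requires to be uniform.
// Returns the uniform value used by each iteration, in execution order.
std::vector<std::uint32_t>
waterfallIterations(const std::vector<std::uint32_t> &LaneValues) {
  std::vector<bool> Pending(LaneValues.size(), true);
  std::vector<std::uint32_t> Iterations;

  for (;;) {
    // Find the first lane that has not executed yet.
    std::size_t First = 0;
    while (First < Pending.size() && !Pending[First])
      ++First;
    if (First == Pending.size())
      break; // Every lane has been handled.

    // Use that lane's value as the uniform operand for this iteration.
    const std::uint32_t Uniform = LaneValues[First];
    Iterations.push_back(Uniform);

    // "Execute" the instruction for all pending lanes that share the value,
    // then retire them so later iterations skip them.
    for (std::size_t I = First; I < Pending.size(); ++I)
      if (Pending[I] && LaneValues[I] == Uniform)
        Pending[I] = false;
  }
  return Iterations;
}
```

In this model the loop runs once per distinct value, so a fully uniform input
costs a single iteration and a fully divergent one costs one iteration per
lane.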

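The constant bus paragraph says the best SGPR to keep is the one with the most
uses. A small sketch of that selection, assuming the pre-gfx10 limit of one
unique SGPR, might look like the following. The Operand struct and
operandsNeedingCopy are hypothetical names invented for this example and do not
correspond to anything in the patch, which deliberately leaves this choice to a
later pass.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical operand model: a register number and whether it is an SGPR.
struct Operand {
  unsigned Reg;
  bool IsSGPR;
};

// Returns the indices of the operands that would need a copy to a VGPR so
// that the instruction reads at most one unique SGPR. The SGPR with the most
// uses is kept; ties go to the lowest register number.
std::vector<std::size_t>
operandsNeedingCopy(const std::vector<Operand> &Ops) {
  // Count how many operands read each SGPR.
  std::map<unsigned, unsigned> UseCount;
  for (const Operand &Op : Ops)
    if (Op.IsSGPR)
      ++UseCount[Op.Reg];

  // Pick the most frequently used SGPR as the one allowed on the bus.
  bool HaveKept = false;
  unsigned Kept = 0;
  unsigned Best = 0;
  for (const auto &Entry : UseCount) {
    if (Entry.second > Best) {
      Kept = Entry.first;
      Best = Entry.second;
      HaveKept = true;
    }
  }

  // Every other SGPR operand needs a copy.
  std::vector<std::size_t> Copies;
  for (std::size_t I = 0; I < Ops.size(); ++I)
    if (Ops[I].IsSGPR && !(HaveKept && Ops[I].Reg == Kept))
      Copies.push_back(I);
  return Copies;
}
```

With operands (s5, s7, s5) only the middle operand needs a copy, because s5 is
read twice and still counts as a single unique SGPR.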