[llvm] [TableGen][NFCI] Speed up generating *GenRegisterInfo.inc files on builds with expensive checks. (PR #67340)

Ivan Kosarev via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 08:06:40 PDT 2023


https://github.com/kosarev created https://github.com/llvm/llvm-project/pull/67340

This is mostly AMDGPU-specific. When the expensive checks are enabled, generating AMDGPUGenRegisterInfo.inc currently takes about 20 minutes on my machine for release+asserts builds, which makes such builds impractical for regular testing. This patch reduces the time to about 2 minutes.

Generation times for AMDGPUGenRegisterInfo.inc without expensive checks and other *GenRegisterInfo.inc files with and without the expensive checks remain approximately the same.

The patch doesn't cause any changes in the contents of the generated files.

The root cause of the current poor performance is that where libstdc++ is used, enabling the expensive checks defines _GLIBCXX_DEBUG, which enables various consistency checks in the library. One such check is in std::binary_search() to make sure the range is ordered. As CodeGenRegisterClass::contains() relies on std::binary_search() and is called a very large number of times from within CodeGenRegBank::inferMatchingSuperRegClass(), the libstdc++ checks heavily affect the runtimes.
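
For illustration only, here is a minimal standalone sketch of the change in lookup strategy. It is not the TableGen code: registers are modelled as plain ints, and the names Reg, subsetBySearch, buildSubToSuperMap and subsetByMap are hypothetical. The first function mirrors the old shape (one std::binary_search() per (Super, Sub) pair); the second mirrors the new shape (group super-registers by sub-register once, then walk the sub-class members and look each one up).

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for 'const CodeGenRegister *'.
using Reg = int;

// Sort and drop duplicates, similar in spirit to sortAndUniqueRegisters().
static void sortAndUnique(std::vector<Reg> &V) {
  std::sort(V.begin(), V.end());
  V.erase(std::unique(V.begin(), V.end()), V.end());
}

// Old shape: one binary search over the sorted member list per pair.
// With _GLIBCXX_DEBUG, each search also verifies the ordering of the range.
std::vector<Reg>
subsetBySearch(const std::vector<std::pair<Reg, Reg>> &SSPairs,
               const std::vector<Reg> &SortedSubRCMembers) {
  std::vector<Reg> Result;
  for (const auto &P : SSPairs)
    if (std::binary_search(SortedSubRCMembers.begin(),
                           SortedSubRCMembers.end(), P.second))
      Result.push_back(P.first);
  sortAndUnique(Result);
  return Result;
}

// Group super-registers by their sub-register, once per sub-register index.
std::unordered_map<Reg, std::vector<Reg>>
buildSubToSuperMap(const std::vector<std::pair<Reg, Reg>> &SSPairs) {
  std::unordered_map<Reg, std::vector<Reg>> SubToSuperRegs;
  for (const auto &P : SSPairs)
    SubToSuperRegs[P.second].push_back(P.first);
  return SubToSuperRegs;
}

// New shape: walk the sub-class members and look each one up in the map,
// so no binary search (and no ordering check) is needed.
std::vector<Reg>
subsetByMap(const std::unordered_map<Reg, std::vector<Reg>> &SubToSuperRegs,
            const std::vector<Reg> &SubRCMembers) {
  std::vector<Reg> Result;
  for (Reg R : SubRCMembers) {
    auto It = SubToSuperRegs.find(R);
    if (It != SubToSuperRegs.end())
      Result.insert(Result.end(), It->second.begin(), It->second.end());
  }
  sortAndUnique(Result);
  return Result;
}
```

Both functions should return the same set of super-registers for the same inputs, which is why the generated files are not expected to change.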

From 2d2de66dda47736dd007a748d8cab648d0876a03 Mon Sep 17 00:00:00 2001
From: Ivan Kosarev <ivan.kosarev at amd.com>
Date: Mon, 25 Sep 2023 15:03:06 +0100
Subject: [PATCH] [TableGen][NFCI] Speed up generating *GenRegisterInfo.inc
 files on builds with expensive checks.

This is mostly AMDGPU-specific. When the expensive checks are enabled,
generating AMDGPUGenRegisterInfo.inc currently takes about 20 minutes on
my machine for release+asserts builds, which makes such builds
impractical for regular testing. This patch reduces the time to about
2 minutes.

Generation times for AMDGPUGenRegisterInfo.inc without expensive checks
and other *GenRegisterInfo.inc files with and without the expensive
checks remain approximately the same.

The patch doesn't cause any changes in the contents of the generated
files.

The root cause of the current poor performance is that where libstdc++
is used, enabling the expensive checks defines _GLIBCXX_DEBUG, which
enables various consistency checks in the library. One such check is in
std::binary_search() to make sure the range is ordered. As
CodeGenRegisterClass::contains() relies on std::binary_search() and is
called a very large number of times from within
CodeGenRegBank::inferMatchingSuperRegClass(), the libstdc++ checks
heavily affect the runtimes.
---
 llvm/utils/TableGen/CodeGenRegisters.cpp | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/llvm/utils/TableGen/CodeGenRegisters.cpp b/llvm/utils/TableGen/CodeGenRegisters.cpp
index bbe0bc9eafebbbe..2d5f5c841a174af 100644
--- a/llvm/utils/TableGen/CodeGenRegisters.cpp
+++ b/llvm/utils/TableGen/CodeGenRegisters.cpp
@@ -2297,8 +2297,8 @@ void CodeGenRegBank::inferSubClassWithSubReg(CodeGenRegisterClass *RC) {
 
 void CodeGenRegBank::inferMatchingSuperRegClass(CodeGenRegisterClass *RC,
                                                 std::list<CodeGenRegisterClass>::iterator FirstSubRegRC) {
-  SmallVector<std::pair<const CodeGenRegister*,
-                        const CodeGenRegister*>, 16> SSPairs;
+  DenseMap<const CodeGenRegister *, std::vector<const CodeGenRegister *>>
+      SubToSuperRegs;
   BitVector TopoSigs(getNumTopoSigs());
 
   // Iterate in SubRegIndex numerical order to visit synthetic indices last.
@@ -2310,12 +2310,12 @@ void CodeGenRegBank::inferMatchingSuperRegClass(CodeGenRegisterClass *RC,
       continue;
 
     // Build list of (Super, Sub) pairs for this SubIdx.
-    SSPairs.clear();
+    SubToSuperRegs.clear();
     TopoSigs.reset();
     for (const auto Super : RC->getMembers()) {
       const CodeGenRegister *Sub = Super->getSubRegs().find(&SubIdx)->second;
       assert(Sub && "Missing sub-register");
-      SSPairs.push_back(std::make_pair(Super, Sub));
+      SubToSuperRegs[Sub].push_back(Super);
       TopoSigs.set(Sub->getTopoSig());
     }
 
@@ -2334,16 +2334,20 @@ void CodeGenRegBank::inferMatchingSuperRegClass(CodeGenRegisterClass *RC,
         continue;
       // Compute the subset of RC that maps into SubRC.
       CodeGenRegister::Vec SubSetVec;
-      for (unsigned i = 0, e = SSPairs.size(); i != e; ++i)
-        if (SubRC.contains(SSPairs[i].second))
-          SubSetVec.push_back(SSPairs[i].first);
+      for (const CodeGenRegister *R : SubRC.getMembers()) {
+        auto It = SubToSuperRegs.find(R);
+        if (It != SubToSuperRegs.end()) {
+          const std::vector<const CodeGenRegister *> &SuperRegs = It->second;
+          SubSetVec.insert(SubSetVec.end(), SuperRegs.begin(), SuperRegs.end());
+        }
+      }
 
       if (SubSetVec.empty())
         continue;
 
       // RC injects completely into SubRC.
       sortAndUniqueRegisters(SubSetVec);
-      if (SubSetVec.size() == SSPairs.size()) {
+      if (SubSetVec.size() == RC->getMembers().size()) {
         SubRC.addSuperRegClass(&SubIdx, RC);
         continue;
       }


