[llvm] [RegAllocFast] Handle single-vdef instrs faster (PR #96284)
Alexis Engelke via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 21 01:11:33 PDT 2024
https://github.com/aengelke created https://github.com/llvm/llvm-project/pull/96284
On x86, many instructions have tied operands, so allocateInstruction uses the more complex assignment strategy, which computes the assignment order of virtual defs first. This involves iterating over all register classes (or register aliases for physical defs) to compute the possible number of defs per register class.
However, this information is only used for sorting virtual defs and therefore not required when there's only one virtual def -- which is a very common case. As iterating over all register classes/aliases is not cheap, do this only when there's more than one virtual def.
---
I'm wondering how many instructions besides inlineasm actually need this analysis. When asserting `!SmallClass`, only a single inlineasm test case actually hits this case. [c-t-t indicates a ~0.8% performance improvement](http://llvm-compile-time-tracker.com/compare.php?from=90779fdc19dc15099231d6ebc39d9d76991d2d43&to=b907bc97c0ff974e323eda06db0468cbd16626e7&stat=instructions:u).
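To illustrate the idea outside of LLVM, here is a minimal standalone sketch of the early exit and of the "too small class first" ordering. Everything in it is a hypothetical toy model (`ToyRegClass`, `ToyDef`, `SortResult`, `sortDefs`, `classSize` are made-up names, and register classes are plain bitmasks); it ignores physical defs, livethroughs, early-clobbers and tied operands, which the real comparator also takes into account.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model, not LLVM's data structures: a register class is a
// bitmask of physical registers.
struct ToyRegClass {
  uint32_t Regs;
};

// A virtual def: its operand index and the (toy) class it must come from.
struct ToyDef {
  unsigned OperandIdx;
  unsigned ClassID;
};

struct SortResult {
  std::vector<unsigned> Order; // operand indexes in assignment order
  bool ComputedCounts;         // whether the per-class pass was needed
};

// Number of registers in a class (population count of the mask).
static unsigned classSize(const ToyRegClass &RC) {
  unsigned N = 0;
  for (uint32_t Bits = RC.Regs; Bits != 0; Bits &= Bits - 1)
    ++N;
  return N;
}

static SortResult sortDefs(const std::vector<ToyRegClass> &Classes,
                           const std::vector<ToyDef> &Defs) {
  SortResult R;
  R.ComputedCounts = false;
  std::vector<ToyDef> Sorted = Defs;

  // With zero or one virtual def there is nothing to order, so the loop
  // over all register classes is skipped entirely.
  if (Sorted.size() > 1) {
    R.ComputedCounts = true;

    // For every class, count the defs that may consume one of its registers:
    // a def of class D can take a register from any class C that is a
    // subset of (or equal to) D.
    std::vector<unsigned> Counts(Classes.size(), 0);
    for (const ToyDef &D : Sorted)
      for (std::size_t C = 0; C < Classes.size(); ++C)
        if ((Classes[C].Regs & ~Classes[D.ClassID].Regs) == 0)
          ++Counts[C];

    // A class is "small" if more defs may want its registers than it has.
    auto IsSmall = [&](const ToyDef &D) {
      return classSize(Classes[D.ClassID]) < Counts[D.ClassID];
    };

    // Defs from small classes go first; ties keep operand order.
    std::sort(Sorted.begin(), Sorted.end(),
              [&](const ToyDef &A, const ToyDef &B) {
                bool SmallA = IsSmall(A), SmallB = IsSmall(B);
                if (SmallA != SmallB)
                  return SmallA;
                return A.OperandIdx < B.OperandIdx;
              });
  }

  for (const ToyDef &D : Sorted)
    R.Order.push_back(D.OperandIdx);
  return R;
}
```

With a toy "gr32" class of eight registers (`0xFF`) and a toy "gr32_abcd" class of four (`0x0F`), two gr32 defs at operands 0 and 2 plus three gr32_abcd defs at operands 1, 3 and 4 give the order 1, 3, 4, 0, 2: five defs may want one of the four abcd registers, so those are assigned first. A single def skips the counting pass altogether.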
From b907bc97c0ff974e323eda06db0468cbd16626e7 Mon Sep 17 00:00:00 2001
From: Alexis Engelke <engelke at in.tum.de>
Date: Fri, 21 Jun 2024 09:23:07 +0200
Subject: [PATCH] [RegAllocFast] Handle single-vdef instrs faster
On x86, most instructions have tied operands, so allocateInstruction
uses the more complex assignment strategy which computes the assignment
order of virtual defs first. This involves iterating over all register
classes (or register aliases for physical defs) to compute the possible
number of defs per register class.
However, this information is only used for sorting virtual defs and
therefore not required when there's only one virtual def -- which is a
very common case. As iterating over all register classes/aliases is not
cheap, do this only when there's more than one virtual def.
---
llvm/lib/CodeGen/RegAllocFast.cpp | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/llvm/lib/CodeGen/RegAllocFast.cpp b/llvm/lib/CodeGen/RegAllocFast.cpp
index 09ce8c42a3850..d194445abbfc8 100644
--- a/llvm/lib/CodeGen/RegAllocFast.cpp
+++ b/llvm/lib/CodeGen/RegAllocFast.cpp
@@ -1289,10 +1289,6 @@ void RegAllocFastImpl::addRegClassDefCounts(
 void RegAllocFastImpl::findAndSortDefOperandIndexes(const MachineInstr &MI) {
   DefOperandIndexes.clear();
 
-  // Track number of defs which may consume a register from the class.
-  std::vector<unsigned> RegClassDefCounts(TRI->getNumRegClasses(), 0);
-  assert(RegClassDefCounts[0] == 0);
-
   LLVM_DEBUG(dbgs() << "Need to assign livethroughs\n");
   for (unsigned I = 0, E = MI.getNumOperands(); I < E; ++I) {
     const MachineOperand &MO = MI.getOperand(I);
@@ -1306,14 +1302,27 @@ void RegAllocFastImpl::findAndSortDefOperandIndexes(const MachineInstr &MI) {
       }
     }
 
-    if (MO.isDef()) {
-      if (Reg.isVirtual() && shouldAllocateRegister(Reg))
-        DefOperandIndexes.push_back(I);
-
-      addRegClassDefCounts(RegClassDefCounts, Reg);
-    }
+    if (MO.isDef() && Reg.isVirtual() && shouldAllocateRegister(Reg))
+      DefOperandIndexes.push_back(I);
   }
 
+  // Most instructions only have one virtual def, so there's no point in
+  // computing the possible number of defs for every register class.
+  if (DefOperandIndexes.size() <= 1)
+    return;
+
+  // Track number of defs which may consume a register from the class. This is
+  // used to assign registers for possibly-too-small classes first. Example:
+  // defs are eax, 3 * gr32_abcd, 2 * gr32 => we want to assign the gr32_abcd
+  // registers first so that the gr32 don't use the gr32_abcd registers before
+  // we assign these.
+  std::vector<unsigned> RegClassDefCounts(TRI->getNumRegClasses(), 0);
+  assert(RegClassDefCounts[0] == 0);
+
+  for (const MachineOperand &MO : MI.operands())
+    if (MO.isReg() && MO.isDef())
+      addRegClassDefCounts(RegClassDefCounts, MO.getReg());
+
   llvm::sort(DefOperandIndexes, [&](uint16_t I0, uint16_t I1) {
     const MachineOperand &MO0 = MI.getOperand(I0);
     const MachineOperand &MO1 = MI.getOperand(I1);