[llvm] [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (PR #109417)
Jeremy Morse via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 20 06:16:21 PDT 2024
https://github.com/jmorse created https://github.com/llvm/llvm-project/pull/109417
tl;dr, if we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic:
https://llvm-compile-time-tracker.com/compare.php?from=983635c014767a2a37f87a4b564b59a78b866154&to=1198cb4aa9c1715d18212a21328801b6985b8ca7&stat=instructions:u
Background: inspired by @SLTozer's introspective collection of stacktraces for some debug-info things, I've instrumented DenseMap to print where it was allocated and the max number of elements it contained. Run over CTMark and with the addition of some filtering, this has picked out the locations in LLVM where we allocate a DenseMap hashtable off the heap but we could instead get away with using the inline buckets of a SmallDenseMap and avoid calling malloc. I picked 16 inline elements at callsites which occasionally have more than 12 elements inserted, and four inline elements for some callsites where there typically aren't any elements inserted.
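For readers who want to try a similar measurement, here is a minimal sketch of the idea of recording a map's peak element count per allocation site. It is not the instrumentation used for this PR (that code is not part of the patch): `PeakTrackingMap`, the `SiteLabel` string and the use of `std::unordered_map` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <unordered_map>

// Hypothetical wrapper, for illustration only: it remembers the largest
// number of elements the map ever held and reports it on destruction.
template <typename K, typename V>
class PeakTrackingMap {
  std::unordered_map<K, V> Map;
  std::size_t Peak = 0;
  const char *Site; // assumed: a caller-supplied label for the callsite

public:
  explicit PeakTrackingMap(const char *SiteLabel) : Site(SiteLabel) {}

  // Report the high-water mark when the map goes out of scope.
  ~PeakTrackingMap() {
    std::fprintf(stderr, "%s: peak %zu elements\n", Site, Peak);
  }

  // Insert-or-lookup, updating the high-water mark afterwards.
  V &operator[](const K &Key) {
    V &Ref = Map[Key];
    Peak = std::max(Peak, Map.size());
    return Ref;
  }

  void erase(const K &Key) { Map.erase(Key); }
  std::size_t size() const { return Map.size(); }
  std::size_t peak() const { return Peak; }
};
```

Aggregating such reports over a benchmark run would show which callsites rarely exceed a handful of elements, which is the kind of data the paragraph above describes.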
One drawback of this technique is that it's fully tuned to making the compile-time-tracker happy, so it might not be representative in general. Counterpoints would be that CTMark is chosen to have a range of different inputs and is vaguely representative, avoiding allocations is almost always a win, and in scenarios where we will /always/ insert at least one element it makes sense to spend a little stack memory to avoid that.
(I have two more patches that contribute another ~0.3% speedup, but this has now hit diminishing returns.)
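As a rough illustration of why inline buckets help, the sketch below shows a toy container that keeps its first `N` entries inside the object and only touches the heap once that space runs out. It is not how `llvm::SmallDenseMap` is implemented (the real class hashes into its inline buckets, whereas this toy does a linear scan); `InlineFirstMap` and its members are hypothetical names used only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical toy container, NOT llvm::SmallDenseMap. Requires K and V to
// be default-constructible and K to be hashable and equality-comparable.
template <typename K, typename V, std::size_t N>
class InlineFirstMap {
  std::array<std::pair<K, V>, N> Inline{};        // stored inside the object
  std::size_t InlineCount = 0;
  std::unique_ptr<std::unordered_map<K, V>> Heap; // created only on spill

public:
  // Insert a key, or overwrite its value if it is already present.
  void insert(const K &Key, const V &Val) {
    if (!Heap) {
      for (std::size_t I = 0; I < InlineCount; ++I)
        if (Inline[I].first == Key) {
          Inline[I].second = Val;
          return;
        }
      if (InlineCount < N) {
        Inline[InlineCount].first = Key;
        Inline[InlineCount].second = Val;
        ++InlineCount;
        return;
      }
      // Inline storage is full: this is the first heap allocation.
      Heap.reset(new std::unordered_map<K, V>());
      for (std::size_t I = 0; I < InlineCount; ++I)
        Heap->emplace(Inline[I].first, Inline[I].second);
      InlineCount = 0;
    }
    (*Heap)[Key] = Val;
  }

  // Return a pointer to the stored value, or nullptr if the key is absent.
  const V *find(const K &Key) const {
    if (!Heap) {
      for (std::size_t I = 0; I < InlineCount; ++I)
        if (Inline[I].first == Key)
          return &Inline[I].second;
      return nullptr;
    }
    auto It = Heap->find(Key);
    return It == Heap->end() ? nullptr : &It->second;
  }

  std::size_t size() const { return Heap ? Heap->size() : InlineCount; }
  bool usesHeap() const { return Heap != nullptr; }
};
```

With `InlineFirstMap<int, int, 4>`, the first four distinct keys never allocate; the fifth triggers the spill. The trade-off is the same one described above: a larger object (typically on the stack) in exchange for skipping malloc in the common small case.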
From b3a4bff74f67be5185641169fa295bceb4a872ce Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 19 Sep 2024 13:01:34 +0100
Subject: [PATCH] [NFC] Switch a number of DenseMaps to SmallDenseMaps
Some instrumentation of densemap allocations has indicated that these are
the variables that most often are filled by a small number of elements,
and thus will benefit the most from having inline elements. I picked 16
inline elements at callsites which occasionally have more than 12 elements
inserted, four inline elements for callsites where there typically aren't
any elements inserted.
---
.../llvm/Analysis/MemoryDependenceAnalysis.h | 2 +-
.../include/llvm/Analysis/SparsePropagation.h | 6 ++--
.../lib/Analysis/MemoryDependenceAnalysis.cpp | 4 +--
llvm/lib/Analysis/ScalarEvolution.cpp | 4 +--
llvm/lib/CodeGen/CalcSpillWeights.cpp | 2 +-
llvm/lib/CodeGen/MachineLICM.cpp | 10 +++---
.../lib/CodeGen/SelectionDAG/InstrEmitter.cpp | 34 +++++++++---------
llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h | 35 ++++++++++---------
.../CodeGen/SelectionDAG/ScheduleDAGFast.cpp | 2 +-
.../SelectionDAG/ScheduleDAGSDNodes.cpp | 12 +++----
.../CodeGen/SelectionDAG/ScheduleDAGSDNodes.h | 2 +-
.../Transforms/IPO/CalledValuePropagation.cpp | 14 ++++----
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | 4 +--
.../Transforms/Vectorize/SLPVectorizer.cpp | 8 ++---
14 files changed, 70 insertions(+), 69 deletions(-)
diff --git a/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h b/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
index decb33e6af6bcb..c31e663498d5f3 100644
--- a/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
+++ b/llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
@@ -492,7 +492,7 @@ class MemoryDependenceResults {
const MemoryLocation &Loc, bool isLoad,
BasicBlock *BB,
SmallVectorImpl<NonLocalDepResult> &Result,
- DenseMap<BasicBlock *, Value *> &Visited,
+ SmallDenseMap<BasicBlock *, Value *, 16> &Visited,
bool SkipFirstBlock = false,
bool IsIncomplete = false);
MemDepResult getNonLocalInfoForBlock(Instruction *QueryInst,
diff --git a/llvm/include/llvm/Analysis/SparsePropagation.h b/llvm/include/llvm/Analysis/SparsePropagation.h
index d5805a7314757f..194f4787a8de91 100644
--- a/llvm/include/llvm/Analysis/SparsePropagation.h
+++ b/llvm/include/llvm/Analysis/SparsePropagation.h
@@ -89,7 +89,7 @@ template <class LatticeKey, class LatticeVal> class AbstractLatticeFunction {
/// \p ChangedValues.
virtual void
ComputeInstructionState(Instruction &I,
- DenseMap<LatticeKey, LatticeVal> &ChangedValues,
+ SmallDenseMap<LatticeKey, LatticeVal, 16> &ChangedValues,
SparseSolver<LatticeKey, LatticeVal> &SS) = 0;
/// PrintLatticeVal - Render the given LatticeVal to the specified stream.
@@ -401,7 +401,7 @@ void SparseSolver<LatticeKey, LatticeVal, KeyInfo>::visitPHINode(PHINode &PN) {
// computed from its incoming values. For example, SSI form stores its sigma
// functions as PHINodes with a single incoming value.
if (LatticeFunc->IsSpecialCasedPHI(&PN)) {
- DenseMap<LatticeKey, LatticeVal> ChangedValues;
+ SmallDenseMap<LatticeKey, LatticeVal, 16> ChangedValues;
LatticeFunc->ComputeInstructionState(PN, ChangedValues, *this);
for (auto &ChangedValue : ChangedValues)
if (ChangedValue.second != LatticeFunc->getUntrackedVal())
@@ -456,7 +456,7 @@ void SparseSolver<LatticeKey, LatticeVal, KeyInfo>::visitInst(Instruction &I) {
// Otherwise, ask the transfer function what the result is. If this is
// something that we care about, remember it.
- DenseMap<LatticeKey, LatticeVal> ChangedValues;
+ SmallDenseMap<LatticeKey, LatticeVal, 16> ChangedValues;
LatticeFunc->ComputeInstructionState(I, ChangedValues, *this);
for (auto &ChangedValue : ChangedValues)
if (ChangedValue.second != LatticeFunc->getUntrackedVal())
diff --git a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
index 79504ca7b73c8f..c5fba184cd0850 100644
--- a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
@@ -888,7 +888,7 @@ void MemoryDependenceResults::getNonLocalPointerDependency(
// each block. Because of critical edges, we currently bail out if querying
// a block with multiple different pointers. This can happen during PHI
// translation.
- DenseMap<BasicBlock *, Value *> Visited;
+ SmallDenseMap<BasicBlock *, Value *, 16> Visited;
if (getNonLocalPointerDepFromBB(QueryInst, Address, Loc, isLoad, FromBB,
Result, Visited, true))
return;
@@ -1038,7 +1038,7 @@ bool MemoryDependenceResults::getNonLocalPointerDepFromBB(
Instruction *QueryInst, const PHITransAddr &Pointer,
const MemoryLocation &Loc, bool isLoad, BasicBlock *StartBB,
SmallVectorImpl<NonLocalDepResult> &Result,
- DenseMap<BasicBlock *, Value *> &Visited, bool SkipFirstBlock,
+ SmallDenseMap<BasicBlock *, Value *, 16> &Visited, bool SkipFirstBlock,
bool IsIncomplete) {
// Look up the cached info for Pointer.
ValueIsLoadPair CacheKey(Pointer.getAddr(), isLoad);
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 1d3443588ce60d..b2c2944c57978d 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2255,7 +2255,7 @@ const SCEV *ScalarEvolution::getAnyExtendExpr(const SCEV *Op,
/// the common case where no interesting opportunities are present, and
/// is also used as a check to avoid infinite recursion.
static bool
-CollectAddOperandsWithScales(DenseMap<const SCEV *, APInt> &M,
+CollectAddOperandsWithScales(SmallDenseMap<const SCEV *, APInt, 16> &M,
SmallVectorImpl<const SCEV *> &NewOps,
APInt &AccumulatedConstant,
ArrayRef<const SCEV *> Ops, const APInt &Scale,
@@ -2753,7 +2753,7 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
// operands multiplied by constant values.
if (Idx < Ops.size() && isa<SCEVMulExpr>(Ops[Idx])) {
uint64_t BitWidth = getTypeSizeInBits(Ty);
- DenseMap<const SCEV *, APInt> M;
+ SmallDenseMap<const SCEV *, APInt, 16> M;
SmallVector<const SCEV *, 8> NewOps;
APInt AccumulatedConstant(BitWidth, 0);
if (CollectAddOperandsWithScales(M, NewOps, AccumulatedConstant,
diff --git a/llvm/lib/CodeGen/CalcSpillWeights.cpp b/llvm/lib/CodeGen/CalcSpillWeights.cpp
index 9d8c9119f7719d..88ed2291313c95 100644
--- a/llvm/lib/CodeGen/CalcSpillWeights.cpp
+++ b/llvm/lib/CodeGen/CalcSpillWeights.cpp
@@ -222,7 +222,7 @@ float VirtRegAuxInfo::weightCalcHelper(LiveInterval &LI, SlotIndex *Start,
bool IsExiting = false;
std::set<CopyHint> CopyHints;
- DenseMap<unsigned, float> Hint;
+ SmallDenseMap<unsigned, float, 8> Hint;
for (MachineRegisterInfo::reg_instr_nodbg_iterator
I = MRI.reg_instr_nodbg_begin(LI.reg()),
E = MRI.reg_instr_nodbg_end();
diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index 6768eeeb4364c8..c1f3d5ac4ff957 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -239,7 +239,7 @@ namespace {
bool IsCheapInstruction(MachineInstr &MI) const;
- bool CanCauseHighRegPressure(const DenseMap<unsigned, int> &Cost,
+ bool CanCauseHighRegPressure(const SmallDenseMap<unsigned, int> &Cost,
bool Cheap);
void UpdateBackTraceRegPressure(const MachineInstr *MI);
@@ -264,7 +264,7 @@ namespace {
void InitRegPressure(MachineBasicBlock *BB);
- DenseMap<unsigned, int> calcRegisterCost(const MachineInstr *MI,
+ SmallDenseMap<unsigned, int> calcRegisterCost(const MachineInstr *MI,
bool ConsiderSeen,
bool ConsiderUnseenAsDef);
@@ -977,10 +977,10 @@ void MachineLICMImpl::UpdateRegPressure(const MachineInstr *MI,
/// If 'ConsiderSeen' is true, updates 'RegSeen' and uses the information to
/// figure out which usages are live-ins.
/// FIXME: Figure out a way to consider 'RegSeen' from all code paths.
-DenseMap<unsigned, int>
+SmallDenseMap<unsigned, int>
MachineLICMImpl::calcRegisterCost(const MachineInstr *MI, bool ConsiderSeen,
bool ConsiderUnseenAsDef) {
- DenseMap<unsigned, int> Cost;
+ SmallDenseMap<unsigned, int> Cost;
if (MI->isImplicitDef())
return Cost;
for (unsigned i = 0, e = MI->getDesc().getNumOperands(); i != e; ++i) {
@@ -1248,7 +1248,7 @@ bool MachineLICMImpl::IsCheapInstruction(MachineInstr &MI) const {
/// Visit BBs from header to current BB, check if hoisting an instruction of the
/// given cost matrix can cause high register pressure.
bool MachineLICMImpl::CanCauseHighRegPressure(
- const DenseMap<unsigned, int> &Cost, bool CheapInstr) {
+ const SmallDenseMap<unsigned, int>& Cost, bool CheapInstr) {
for (const auto &RPIdAndCost : Cost) {
if (RPIdAndCost.second <= 0)
continue;
diff --git a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
index 53ce21906204c8..738319d44d2a53 100644
--- a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -83,7 +83,7 @@ static unsigned countOperands(SDNode *Node, unsigned NumExpUses,
/// implicit physical register output.
void InstrEmitter::EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone,
Register SrcReg,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
Register VRBase;
if (SrcReg.isVirtual()) {
// Just use the input register directly!
@@ -187,7 +187,7 @@ void InstrEmitter::CreateVirtualRegisters(SDNode *Node,
MachineInstrBuilder &MIB,
const MCInstrDesc &II,
bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
assert(Node->getMachineOpcode() != TargetOpcode::IMPLICIT_DEF &&
"IMPLICIT_DEF should have been handled as a special case elsewhere!");
@@ -266,7 +266,7 @@ void InstrEmitter::CreateVirtualRegisters(SDNode *Node,
/// getVR - Return the virtual register corresponding to the specified result
/// of the specified node.
Register InstrEmitter::getVR(SDValue Op,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
if (Op.isMachineOpcode() &&
Op.getMachineOpcode() == TargetOpcode::IMPLICIT_DEF) {
// Add an IMPLICIT_DEF instruction before every use.
@@ -280,7 +280,7 @@ Register InstrEmitter::getVR(SDValue Op,
return VReg;
}
- DenseMap<SDValue, Register>::iterator I = VRBaseMap.find(Op);
+ VRBaseMapType::iterator I = VRBaseMap.find(Op);
assert(I != VRBaseMap.end() && "Node emitted out of order - late");
return I->second;
}
@@ -318,7 +318,7 @@ InstrEmitter::AddRegisterOperand(MachineInstrBuilder &MIB,
SDValue Op,
unsigned IIOpNum,
const MCInstrDesc *II,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsDebug, bool IsClone, bool IsCloned) {
assert(Op.getValueType() != MVT::Other &&
Op.getValueType() != MVT::Glue &&
@@ -399,7 +399,7 @@ void InstrEmitter::AddOperand(MachineInstrBuilder &MIB,
SDValue Op,
unsigned IIOpNum,
const MCInstrDesc *II,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsDebug, bool IsClone, bool IsCloned) {
if (Op.isMachineOpcode()) {
AddRegisterOperand(MIB, Op, IIOpNum, II, VRBaseMap,
@@ -500,7 +500,7 @@ Register InstrEmitter::ConstrainForSubReg(Register VReg, unsigned SubIdx,
/// EmitSubregNode - Generate machine code for subreg nodes.
///
void InstrEmitter::EmitSubregNode(SDNode *Node,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsClone, bool IsCloned) {
Register VRBase;
unsigned Opc = Node->getMachineOpcode();
@@ -634,7 +634,7 @@ void InstrEmitter::EmitSubregNode(SDNode *Node,
///
void
InstrEmitter::EmitCopyToRegClassNode(SDNode *Node,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
Register VReg = getVR(Node->getOperand(0), VRBaseMap);
// Create the new VReg in the destination class and emit a copy.
@@ -654,7 +654,7 @@ InstrEmitter::EmitCopyToRegClassNode(SDNode *Node,
/// EmitRegSequence - Generate machine code for REG_SEQUENCE nodes.
///
void InstrEmitter::EmitRegSequence(SDNode *Node,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsClone, bool IsCloned) {
unsigned DstRCIdx = Node->getConstantOperandVal(0);
const TargetRegisterClass *RC = TRI->getRegClass(DstRCIdx);
@@ -703,7 +703,7 @@ void InstrEmitter::EmitRegSequence(SDNode *Node,
///
MachineInstr *
InstrEmitter::EmitDbgValue(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
DebugLoc DL = SD->getDebugLoc();
assert(cast<DILocalVariable>(SD->getVariable())
->isValidLocationForIntrinsic(DL) &&
@@ -755,7 +755,7 @@ MachineOperand GetMOForConstDbgOp(const SDDbgOperand &Op) {
void InstrEmitter::AddDbgValueLocationOps(
MachineInstrBuilder &MIB, const MCInstrDesc &DbgValDesc,
ArrayRef<SDDbgOperand> LocationOps,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
for (const SDDbgOperand &Op : LocationOps) {
switch (Op.getKind()) {
case SDDbgOperand::FRAMEIX:
@@ -786,7 +786,7 @@ void InstrEmitter::AddDbgValueLocationOps(
MachineInstr *
InstrEmitter::EmitDbgInstrRef(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
MDNode *Var = SD->getVariable();
const DIExpression *Expr = (DIExpression *)SD->getExpression();
DebugLoc DL = SD->getDebugLoc();
@@ -862,7 +862,7 @@ InstrEmitter::EmitDbgInstrRef(SDDbgValue *SD,
// Look up the corresponding VReg for the given SDNode, if any.
SDNode *Node = DbgOperand.getSDNode();
SDValue Op = SDValue(Node, DbgOperand.getResNo());
- DenseMap<SDValue, Register>::iterator I = VRBaseMap.find(Op);
+ VRBaseMapType::iterator I = VRBaseMap.find(Op);
// No VReg -> produce a DBG_VALUE $noreg instead.
if (I == VRBaseMap.end())
break;
@@ -928,7 +928,7 @@ MachineInstr *InstrEmitter::EmitDbgNoLocation(SDDbgValue *SD) {
MachineInstr *
InstrEmitter::EmitDbgValueList(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
MDNode *Var = SD->getVariable();
DIExpression *Expr = SD->getExpression();
DebugLoc DL = SD->getDebugLoc();
@@ -944,7 +944,7 @@ InstrEmitter::EmitDbgValueList(SDDbgValue *SD,
MachineInstr *
InstrEmitter::EmitDbgValueFromSingleOp(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
MDNode *Var = SD->getVariable();
DIExpression *Expr = SD->getExpression();
DebugLoc DL = SD->getDebugLoc();
@@ -996,7 +996,7 @@ InstrEmitter::EmitDbgLabel(SDDbgLabel *SD) {
///
void InstrEmitter::
EmitMachineNode(SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
unsigned Opc = Node->getMachineOpcode();
// Handle subreg insert/extract specially
@@ -1238,7 +1238,7 @@ EmitMachineNode(SDNode *Node, bool IsClone, bool IsCloned,
/// needed dependencies.
void InstrEmitter::
EmitSpecialNode(SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
switch (Node->getOpcode()) {
default:
#ifndef NDEBUG
diff --git a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h
index 959bce31c8b278..fcfcaa8d35b848 100644
--- a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h
+++ b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h
@@ -30,6 +30,8 @@ class TargetLowering;
class TargetMachine;
class LLVM_LIBRARY_VISIBILITY InstrEmitter {
+ using VRBaseMapType = SmallDenseMap<SDValue, Register, 16>;
+
MachineFunction *MF;
MachineRegisterInfo *MRI;
const TargetInstrInfo *TII;
@@ -45,18 +47,17 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
/// EmitCopyFromReg - Generate machine code for an CopyFromReg node or an
/// implicit physical register output.
void EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone,
- Register SrcReg, DenseMap<SDValue, Register> &VRBaseMap);
+ Register SrcReg, VRBaseMapType &VRBaseMap);
void CreateVirtualRegisters(SDNode *Node,
MachineInstrBuilder &MIB,
const MCInstrDesc &II,
bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// getVR - Return the virtual register corresponding to the specified result
/// of the specified node.
- Register getVR(SDValue Op,
- DenseMap<SDValue, Register> &VRBaseMap);
+ Register getVR(SDValue Op, VRBaseMapType &VRBaseMap);
/// AddRegisterOperand - Add the specified register as an operand to the
/// specified machine instr. Insert register copies if the register is
@@ -65,7 +66,7 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
SDValue Op,
unsigned IIOpNum,
const MCInstrDesc *II,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsDebug, bool IsClone, bool IsCloned);
/// AddOperand - Add the specified operand to the specified machine instr. II
@@ -76,7 +77,7 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
SDValue Op,
unsigned IIOpNum,
const MCInstrDesc *II,
- DenseMap<SDValue, Register> &VRBaseMap,
+ VRBaseMapType &VRBaseMap,
bool IsDebug, bool IsClone, bool IsCloned);
/// ConstrainForSubReg - Try to constrain VReg to a register class that
@@ -87,7 +88,7 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
/// EmitSubregNode - Generate machine code for subreg nodes.
///
- void EmitSubregNode(SDNode *Node, DenseMap<SDValue, Register> &VRBaseMap,
+ void EmitSubregNode(SDNode *Node, VRBaseMapType &VRBaseMap,
bool IsClone, bool IsCloned);
/// EmitCopyToRegClassNode - Generate machine code for COPY_TO_REGCLASS nodes.
@@ -95,11 +96,11 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
/// register is constrained to be in a particular register class.
///
void EmitCopyToRegClassNode(SDNode *Node,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// EmitRegSequence - Generate machine code for REG_SEQUENCE nodes.
///
- void EmitRegSequence(SDNode *Node, DenseMap<SDValue, Register> &VRBaseMap,
+ void EmitRegSequence(SDNode *Node, VRBaseMapType &VRBaseMap,
bool IsClone, bool IsCloned);
public:
/// CountResults - The results of target nodes have register or immediate
@@ -110,29 +111,29 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
void AddDbgValueLocationOps(MachineInstrBuilder &MIB,
const MCInstrDesc &DbgValDesc,
ArrayRef<SDDbgOperand> Locations,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// EmitDbgValue - Generate machine instruction for a dbg_value node.
///
MachineInstr *EmitDbgValue(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// Emit a dbg_value as a DBG_INSTR_REF. May produce DBG_VALUE $noreg instead
/// if there is no variable location; alternately a half-formed DBG_INSTR_REF
/// that refers to a virtual register and is corrected later in isel.
MachineInstr *EmitDbgInstrRef(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// Emit a DBG_VALUE $noreg, indicating a variable has no location.
MachineInstr *EmitDbgNoLocation(SDDbgValue *SD);
/// Emit a DBG_VALUE_LIST from the operands to SDDbgValue.
MachineInstr *EmitDbgValueList(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// Emit a DBG_VALUE from the operands to SDDbgValue.
MachineInstr *EmitDbgValueFromSingleOp(SDDbgValue *SD,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
/// Generate machine instruction for a dbg_label node.
MachineInstr *EmitDbgLabel(SDDbgLabel *SD);
@@ -140,7 +141,7 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
/// EmitNode - Generate machine code for a node and needed dependencies.
///
void EmitNode(SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap) {
+ VRBaseMapType &VRBaseMap) {
if (Node->isMachineOpcode())
EmitMachineNode(Node, IsClone, IsCloned, VRBaseMap);
else
@@ -160,9 +161,9 @@ class LLVM_LIBRARY_VISIBILITY InstrEmitter {
private:
void EmitMachineNode(SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
void EmitSpecialNode(SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap);
+ VRBaseMapType &VRBaseMap);
};
} // namespace llvm
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp
index de4a1ac2a3bafc..619f622de87963 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp
@@ -770,7 +770,7 @@ void ScheduleDAGLinearize::Schedule() {
MachineBasicBlock*
ScheduleDAGLinearize::EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
InstrEmitter Emitter(DAG->getTarget(), BB, InsertPos);
- DenseMap<SDValue, Register> VRBaseMap;
+ SmallDenseMap<SDValue, Register, 16> VRBaseMap;
LLVM_DEBUG({ dbgs() << "\n*** Final schedule ***\n"; });
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
index 53dd71d173473c..99016d8013499c 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
@@ -737,7 +737,7 @@ void ScheduleDAGSDNodes::VerifyScheduledSequence(bool isBottomUp) {
static void
ProcessSDDbgValues(SDNode *N, SelectionDAG *DAG, InstrEmitter &Emitter,
SmallVectorImpl<std::pair<unsigned, MachineInstr*> > &Orders,
- DenseMap<SDValue, Register> &VRBaseMap, unsigned Order) {
+ SmallDenseMap<SDValue, Register, 16> &VRBaseMap, unsigned Order) {
if (!N->getHasDebugValue())
return;
@@ -782,7 +782,7 @@ ProcessSDDbgValues(SDNode *N, SelectionDAG *DAG, InstrEmitter &Emitter,
// instructions in the right order.
static void
ProcessSourceNode(SDNode *N, SelectionDAG *DAG, InstrEmitter &Emitter,
- DenseMap<SDValue, Register> &VRBaseMap,
+ SmallDenseMap<SDValue, Register, 16> &VRBaseMap,
SmallVectorImpl<std::pair<unsigned, MachineInstr *>> &Orders,
SmallSet<Register, 8> &Seen, MachineInstr *NewInsn) {
unsigned Order = N->getIROrder();
@@ -808,7 +808,7 @@ ProcessSourceNode(SDNode *N, SelectionDAG *DAG, InstrEmitter &Emitter,
}
void ScheduleDAGSDNodes::
-EmitPhysRegCopy(SUnit *SU, DenseMap<SUnit*, Register> &VRBaseMap,
+EmitPhysRegCopy(SUnit *SU, SmallDenseMap<SUnit*, Register, 16> &VRBaseMap,
MachineBasicBlock::iterator InsertPos) {
for (const SDep &Pred : SU->Preds) {
if (Pred.isCtrl())
@@ -851,8 +851,8 @@ EmitPhysRegCopy(SUnit *SU, DenseMap<SUnit*, Register> &VRBaseMap,
MachineBasicBlock *ScheduleDAGSDNodes::
EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
InstrEmitter Emitter(DAG->getTarget(), BB, InsertPos);
- DenseMap<SDValue, Register> VRBaseMap;
- DenseMap<SUnit*, Register> CopyVRBaseMap;
+ SmallDenseMap<SDValue, Register, 16> VRBaseMap;
+ SmallDenseMap<SUnit*, Register, 16> CopyVRBaseMap;
SmallVector<std::pair<unsigned, MachineInstr*>, 32> Orders;
SmallSet<Register, 8> Seen;
bool HasDbg = DAG->hasDebugValues();
@@ -861,7 +861,7 @@ EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
// Zero, one, or multiple instructions can be created when emitting a node.
auto EmitNode =
[&](SDNode *Node, bool IsClone, bool IsCloned,
- DenseMap<SDValue, Register> &VRBaseMap) -> MachineInstr * {
+ SmallDenseMap<SDValue, Register, 16> &VRBaseMap) -> MachineInstr * {
// Fetch instruction prior to this, or end() if nonexistant.
auto GetPrevInsn = [&](MachineBasicBlock::iterator I) {
if (I == BB->begin())
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.h b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.h
index 446df640821d8b..2811b41fc74331 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.h
@@ -184,7 +184,7 @@ class InstrItineraryData;
void BuildSchedUnits();
void AddSchedEdges();
- void EmitPhysRegCopy(SUnit *SU, DenseMap<SUnit*, Register> &VRBaseMap,
+ void EmitPhysRegCopy(SUnit *SU, SmallDenseMap<SUnit*, Register, 16> &VRBaseMap,
MachineBasicBlock::iterator InsertPos);
};
diff --git a/llvm/lib/Transforms/IPO/CalledValuePropagation.cpp b/llvm/lib/Transforms/IPO/CalledValuePropagation.cpp
index acc10f57c29acb..a75a0bbc5301df 100644
--- a/llvm/lib/Transforms/IPO/CalledValuePropagation.cpp
+++ b/llvm/lib/Transforms/IPO/CalledValuePropagation.cpp
@@ -169,7 +169,7 @@ class CVPLatticeFunc
/// just a few kinds of instructions since we're only propagating values that
/// can be called.
void ComputeInstructionState(
- Instruction &I, DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ Instruction &I, SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) override {
switch (I.getOpcode()) {
case Instruction::Call:
@@ -239,7 +239,7 @@ class CVPLatticeFunc
/// Handle return instructions. The function's return state is the merge of
/// the returned value state and the function's return state.
void visitReturn(ReturnInst &I,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
Function *F = I.getParent()->getParent();
if (F->getReturnType()->isVoidTy())
@@ -255,7 +255,7 @@ class CVPLatticeFunc
/// argument state. The call site state is the merge of the call site state
/// with the returned value state of the called function.
void visitCallBase(CallBase &CB,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
Function *F = CB.getCalledFunction();
auto RegI = CVPLatticeKey(&CB, IPOGrouping::Register);
@@ -299,7 +299,7 @@ class CVPLatticeFunc
/// Handle select instructions. The select instruction state is the merge the
/// true and false value states.
void visitSelect(SelectInst &I,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
auto RegI = CVPLatticeKey(&I, IPOGrouping::Register);
auto RegT = CVPLatticeKey(I.getTrueValue(), IPOGrouping::Register);
@@ -312,7 +312,7 @@ class CVPLatticeFunc
/// variable, we attempt to track the value. The loaded value state is the
/// merge of the loaded value state with the global variable state.
void visitLoad(LoadInst &I,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
auto RegI = CVPLatticeKey(&I, IPOGrouping::Register);
if (auto *GV = dyn_cast<GlobalVariable>(I.getPointerOperand())) {
@@ -328,7 +328,7 @@ class CVPLatticeFunc
/// global variable, we attempt to track the value. The global variable state
/// is the merge of the stored value state with the global variable state.
void visitStore(StoreInst &I,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
auto *GV = dyn_cast<GlobalVariable>(I.getPointerOperand());
if (!GV)
@@ -342,7 +342,7 @@ class CVPLatticeFunc
/// Handle all other instructions. All other instructions are marked
/// overdefined.
void visitInst(Instruction &I,
- DenseMap<CVPLatticeKey, CVPLatticeVal> &ChangedValues,
+ SmallDenseMap<CVPLatticeKey, CVPLatticeVal, 16> &ChangedValues,
SparseSolver<CVPLatticeKey, CVPLatticeVal> &SS) {
// Simply bail if this instruction has no user.
if (I.use_empty())
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 4144c7993b7e42..77f4f2822d7a17 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -503,7 +503,7 @@ static bool removeRedundantDbgInstrsUsingBackwardScan(BasicBlock *BB) {
static bool
DbgVariableRecordsRemoveRedundantDbgInstrsUsingForwardScan(BasicBlock *BB) {
SmallVector<DbgVariableRecord *, 8> ToBeRemoved;
- DenseMap<DebugVariable, std::pair<SmallVector<Value *, 4>, DIExpression *>>
+ SmallDenseMap<DebugVariable, std::pair<SmallVector<Value *, 4>, DIExpression *>, 4>
VariableMap;
for (auto &I : *BB) {
for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange())) {
@@ -584,7 +584,7 @@ static bool removeRedundantDbgInstrsUsingForwardScan(BasicBlock *BB) {
return DbgVariableRecordsRemoveRedundantDbgInstrsUsingForwardScan(BB);
SmallVector<DbgValueInst *, 8> ToBeRemoved;
- DenseMap<DebugVariable, std::pair<SmallVector<Value *, 4>, DIExpression *>>
+ SmallDenseMap<DebugVariable, std::pair<SmallVector<Value *, 4>, DIExpression *>, 4>
VariableMap;
for (auto &I : *BB) {
if (DbgValueInst *DVI = dyn_cast<DbgValueInst>(&I)) {
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index ddeb97463281c9..068d352dc985f2 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -6964,7 +6964,7 @@ void BoUpSLP::buildTree_rec(ArrayRef<Value *> VL, unsigned Depth,
auto TryToFindDuplicates = [&](const InstructionsState &S,
bool DoNotFail = false) {
// Check that every instruction appears once in this bundle.
- DenseMap<Value *, unsigned> UniquePositions(VL.size());
+ SmallDenseMap<Value *, unsigned, 16> UniquePositions(VL.size());
for (Value *V : VL) {
if (isConstant(V)) {
ReuseShuffleIndices.emplace_back(
@@ -17726,7 +17726,7 @@ class HorizontalReduction {
for (Value *V : Candidates)
TrackedVals.try_emplace(V, V);
- auto At = [](MapVector<Value *, unsigned> &MV, Value *V) -> unsigned & {
+ auto At = [](SmallMapVector<Value *, unsigned, 16> &MV, Value *V) -> unsigned & {
auto *It = MV.find(V);
assert(It != MV.end() && "Unable to find given key.");
return It->second;
@@ -17813,7 +17813,7 @@ class HorizontalReduction {
RdxKind != RecurKind::FMul &&
RdxKind != RecurKind::FMulAdd;
// Gather same values.
- MapVector<Value *, unsigned> SameValuesCounter;
+ SmallMapVector<Value *, unsigned, 16> SameValuesCounter;
if (IsSupportedHorRdxIdentityOp)
for (Value *V : Candidates) {
Value *OrigV = TrackedToOrig.at(V);
@@ -18426,7 +18426,7 @@ class HorizontalReduction {
/// horizontal reduction analysis.
Value *emitReusedOps(Value *VectorizedValue, IRBuilderBase &Builder,
BoUpSLP &R,
- const MapVector<Value *, unsigned> &SameValuesCounter,
+ const SmallMapVector<Value *, unsigned, 16> &SameValuesCounter,
const DenseMap<Value *, Value *> &TrackedToOrig) {
assert(IsSupportedHorRdxIdentityOp &&
"The optimization of matched scalar identity horizontal reductions "