[PATCH] D132988: Changes to use a string pool for symbols
Lang Hames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 7 18:15:52 PDT 2022
lhames added a comment.
I love seeing all that `BlockDependenciesMap` code disappearing as a result of this change. :D
Lots of minor changes, but the broad direction of this is _great_! Once this patch lands the next big target should be `LinkGraph::LookupMap` -- we should be able to ditch the StringRefs and pass the `SymbolStringPtr`s right through to the context. That'll avoid a tonne of re-interning on the `ObjectLinkingLayer` side.
Very cool work!
================
Comment at: llvm/include/llvm/ExecutionEngine/JITLink/JITLink.h:1100
/// be undefined, in which case they are assigned a value of 0.
Symbol &addExternalSymbol(StringRef Name, orc::ExecutorAddrDiff Size,
Linkage L) {
----------------
This and the other "add" methods can take `SymbolStringPtr`s now, rather than StringRefs.
This is a place where I wouldn't mind a convenience overload for the StringRef though. I.e.:
```
Symbol &addExternalSymbol(SymbolStringPtr Name, orc::ExecutorAddrDiff Size, Linkage L) {
assert(llvm::count(ExternalSymbols, Name) == 0 && "Duplicate external symbol");
auto &Sym = Symbol::constructExternal(
Allocator, createAddressable(orc::ExecutorAddr(), false),
std::move(Name), Size, L);
ExternalSymbols.insert(&Sym);
return Sym;
}
Symbol &addExternalSymbol(StringRef Name, orc::ExecutorAddrDiff Size, Linkage L) {
return addExternalSymbol(SSP->intern(Name), Size, L);
}
```
================
Comment at: llvm/include/llvm/ExecutionEngine/JITLink/JITLink.h:500
/// Returns the name of this symbol (empty if the symbol is anonymous).
- StringRef getName() const {
- assert((!Name.empty() || getScope() == Scope::Local) &&
+ orc::SymbolStringPtr getName() const {
+ assert((Name || getScope() == Scope::Local) &&
----------------
dblaikie wrote:
> jaredwy wrote:
> > dblaikie wrote:
> > > This could return by const-ref (the same way a large member like a std::vector would return from an accessor by const ref) & then `isNamed` could be removed & the usage code kept the same? (I assume `isNamed` was added to avoid the cost of copying the `SymbolStringPtr` - and if it returns by const ref there's no added cost? Oh, I guess callers still have to add the `*` in to dereference? yeah, but still maybe better than adding this `isNamed` function that's a bit quirky - since it only allows the one operation (equality) without the `SymbolStringPtr` copy, which is a bit niche/special case/inflexible)
> > It was mostly added to aid in readability of the code and avoiding the unnecessary copy but I am not attached to it, I am new to these lands so if it is not the LLVM way to add utility functions i can remove
> Could you show/compare the readability issues you had in mind?
>
> Would `getName` returning by const-ref address the unnecessary copy concerns?
I'd be in favor of having `getName()` return by const-ref and dropping `isNamed` -- this would be consistent with the existing getters/setters.
================
Comment at: llvm/include/llvm/ExecutionEngine/JITLink/TableManager.h:42
dbgs() << " Created" << impl().getSectionName() << "entry for "
- << Target.getName() << ": " << Entry << "\n";
+ << *(Target.getName()) << ": " << Entry << "\n";
});
----------------
Are the parens needed here? I think `*` binds tight enough that you can omit them.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/COFFLinkGraphBuilder.cpp:238-239
*Sym, Sec);
- GSym = &createDLLImportEntry(SymbolName, *ExternalSym);
+ auto SSPName = G->getSymbolStringPool()->intern(SymbolName);
+ GSym = &createDLLImportEntry(SSPName, *ExternalSym);
} else
----------------
I'd include `intern` convenience methods on both the `LinkGraph` and the builder, and then use them to avoid naming `SymbolStringPtr`s that only have one use. I.e.
```
Gym = &createDLLImportEntry(intern(SymbolName), *ExternalSym);
```
OTOH it's always worth naming a SymbolStringPtr that gets used more than once (otherwise you'd have to intern it more than once).
================
Comment at: llvm/lib/ExecutionEngine/JITLink/EHFrameSupport.cpp:388
<< ((*PCBegin)->isExternal() ? "external" : "absolute")
- << " symbol \"" << (*PCBegin)->getName()
+ << " symbol \"" << *((*PCBegin)->getName())
<< "\" -- FDE must be kept alive manually or it will be "
----------------
I think you can drop the outer parens here.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/EHFrameSupport.cpp:541
if (EdgeI->second.Target->hasName())
- dbgs() << " (" << EdgeI->second.Target->getName() << ")";
+ dbgs() << " (" << *(EdgeI->second.Target->getName()) << ")";
dbgs() << "\n";
----------------
I think you can drop them outer parens here too.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/ELF_x86_64.cpp:280
[&](LinkGraph &LG, Symbol &Sym) -> SectionRangeSymbolDesc {
- if (Sym.getName() == ELFGOTSymbolName)
+ if (*Sym.getName() == ELFGOTSymbolName)
if (auto *GOTSection = G.findSectionByName(
----------------
Could lazily intern `ELFGOTSymbolName` into an `ELFJITLinker_x86_64` member variable to speed up repeat comparisons?
No need to do that for this patch, but maybe add a FIXME to follow up on the idea?
================
Comment at: llvm/lib/ExecutionEngine/JITLink/JITLinkGeneric.cpp:203
"External has already been assigned an address");
- assert(Sym->getName() != StringRef() && Sym->getName() != "" &&
+ assert(Sym->hasName() && *Sym->getName() != "" &&
"Externals must be named");
----------------
I think you can now just `assert(Sym->getName() && "Externals must be named")`.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/JITLinkGeneric.cpp:209
: SymbolLookupFlags::RequiredSymbol;
- UnresolvedExternals[Sym->getName()] = LookupFlags;
+ UnresolvedExternals[*Sym->getName()] = LookupFlags;
}
----------------
We should key `UnresolvedExternals` on the `SymbolStringPtr` now, rather than the StringRef.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/PerGraphGOTAndPLTStubsBuilder.h:73
- auto GOTEntryI = GOTEntries.find(Target.getName());
+ auto GOTEntryI = GOTEntries.find(*(Target.getName()));
----------------
I think `GOTEntries` should be able to be keyed on the `SymbolStringPtr` too.
================
Comment at: llvm/lib/ExecutionEngine/JITLink/PerGraphGOTAndPLTStubsBuilder.h:95
"External branch edge can not point to an anonymous target");
- auto StubI = PLTStubs.find(Target.getName());
+ auto StubI = PLTStubs.find(*Target.getName());
----------------
PLTStubs should likewise be keyed on the `SymbolStringPtr`.
================
Comment at: llvm/lib/ExecutionEngine/Orc/COFFPlatform.cpp:67
}
-
+ auto SSP = CP.getExecutionSession().getSymbolStringPool();
auto G = std::make_unique<jitlink::LinkGraph>(
----------------
We need to use the `ExecutionSession`'s `SymbolStringPool` here. I'd update this to take the `ExecutionSession&` -- `COFFPlatform` has an ES member handy that you can pass in.
================
Comment at: llvm/lib/ExecutionEngine/Orc/COFFPlatform.cpp:733
auto I = llvm::find_if(G.defined_symbols(), [this](jitlink::Symbol *Sym) {
- return Sym->getName() == *CP.COFFHeaderStartSymbol;
+ return *Sym->getName() == *CP.COFFHeaderStartSymbol;
});
----------------
Once everyone's sharing the same pool this can be a direct `SymbolStringPtr` value comparison.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ELFNixPlatform.cpp:693
auto I = llvm::find_if(G.defined_symbols(), [this](jitlink::Symbol *Sym) {
- return Sym->getName() == *MP.DSOHandleSymbol;
+ return *Sym->getName() == *MP.DSOHandleSymbol;
});
----------------
Can be a `SSP` comparison once the pool is shared.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ELFNixPlatform.cpp:843-850
+ if (*Sym->getName() == "__tls_get_addr") {
+ auto TLSGetAddr =
+ MP.getExecutionSession().intern("___orc_rt_elfnix_tls_get_addr");
+ Sym->setName(std::move(TLSGetAddr));
+ } else if (*Sym->getName() == "__tlsdesc_resolver") {
+ auto TLSGetAddr =
+ MP.getExecutionSession().intern("___orc_rt_elfnix_tlsdesc_resolver");
----------------
These `tls` symbol strings should be interned and stored in the plugin -- that'll save re-interning them for each graph.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ExecutionUtils.cpp:532-535
auto NameCopy = G->allocateString(Twine(getImpPrefix()) + *KV.first);
StringRef NameCopyRef = StringRef(NameCopy.data(), NameCopy.size());
- Ptr.setName(NameCopyRef);
+ auto SymbolName = G->getSymbolStringPool()->intern(NameCopyRef);
+ Ptr.setName(std::move(SymbolName));
----------------
We shouldn't need to allocate a string on the `LinkGraph` for this any more. Instead, you should just be able to say:
```
Ptr.setName(G->intern(getImplPrefix() + *KV.first));
```
================
Comment at: llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp:676
auto I = llvm::find_if(G.defined_symbols(), [this](jitlink::Symbol *Sym) {
- return Sym->getName() == *MP.MachOHeaderStartSymbol;
+ return *Sym->getName() == *MP.MachOHeaderStartSymbol;
});
----------------
Can compare SSP values once the pool is shared.
================
Comment at: llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp:814-817
+ if (*Sym->getName() == "__tlv_bootstrap") {
+ auto TLSGetADDR =
+ MP.getExecutionSession().intern("___orc_rt_macho_tlv_get_addr");
+ Sym->setName(std::move(TLSGetADDR));
----------------
We should intern these `tlv` strings and store them as members in the plugin to avoid redundant interning.
================
Comment at: llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp:994-997
+ auto SymName = *Sym->getName();
+ if (SymName == "___orc_rt_macho_register_ehframe_section")
orc_rt_macho_register_ehframe_section = ExecutorAddr(Sym->getAddress());
+ else if (SymName == "___orc_rt_macho_deregister_ehframe_section")
----------------
More strings to be interned and stored in the plugin.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp:212-225
+ auto InternedName = Sym->getName();
JITSymbolFlags Flags;
if (Sym->isCallable())
Flags |= JITSymbolFlags::Callable;
if (Sym->getScope() == Scope::Default)
Flags |= JITSymbolFlags::Exported;
----------------
Once `Symbol::getName()` returns by const-ref I think you can replace `InternedName` with `Sym->getName()` everywhere here.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp:231
if (Sym->hasName() && Sym->getScope() != Scope::Local) {
- auto InternedName = ES.intern(Sym->getName());
+ auto InternedName = Sym->getName();
JITSymbolFlags Flags;
----------------
Ditto here.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp:414-420
+ auto Name = Sym->getName();
+ if (!MR->getSymbols().count(Name)) {
JITSymbolFlags SF = JITSymbolFlags::Weak;
if (Sym->getScope() == Scope::Default)
SF |= JITSymbolFlags::Exported;
NewSymbolsToClaim[Name] = SF;
NameToSym.push_back(std::make_pair(std::move(Name), Sym));
----------------
I'd drop `Name` and just use `Sym->getName()` here.
================
Comment at: llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp:465-469
+ auto SymName = Sym->getName();
if (!SymDeps.External.empty())
ExternalNamedSymbolDeps[SymName] = SymDeps.External;
if (!SymDeps.Internal.empty())
InternalNamedSymbolDeps[SymName] = SymDeps.Internal;
----------------
Drop `SymName` and just use `Sym->getName()`.
================
Comment at: llvm/unittests/ExecutionEngine/JITLink/LinkGraphTests.cpp:75-89
+TEST(LinkGraphTest, ConstructionWithPool) {
+ auto SSP = std::make_shared<orc::SymbolStringPool>();
+ // Check that LinkGraph construction works as expected.
+ LinkGraph G("foo", SSP, Triple("x86_64-apple-darwin"), 8, support::little,
+ getGenericEdgeKindName);
+ EXPECT_EQ(G.getName(), "foo");
+ EXPECT_EQ(G.getTargetTriple().str(), "x86_64-apple-darwin");
----------------
Once you drop the convenience method this shouldn't be needed.
================
Comment at: llvm/unittests/ExecutionEngine/JITLink/LinkGraphTests.cpp:717
+}
\ No newline at end of file
----------------
Missing newline. :)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132988/new/
https://reviews.llvm.org/D132988
More information about the llvm-commits
mailing list