[llvm] 963378b - [ORC] Improve computeLocalDeps / computeNamedSymbolDependencies performance.

Lang Hames via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 7 23:40:11 PDT 2021


Author: Lang Hames
Date: 2021-07-08T16:31:59+10:00
New Revision: 963378bd8278220eb382bec76846ef39e4ea597e

URL: https://github.com/llvm/llvm-project/commit/963378bd8278220eb382bec76846ef39e4ea597e
DIFF: https://github.com/llvm/llvm-project/commit/963378bd8278220eb382bec76846ef39e4ea597e.diff

LOG: [ORC] Improve computeLocalDeps / computeNamedSymbolDependencies performance.

The computeNamedSymbolDependencies and computeLocalDeps methods on
ObjectLinkingLayerJITLinkContext are responsible for computing, for each symbol
in the current MaterializationResponsibility, the set of non-locally-scoped
symbols that are depended on. To calculate this we have to consider the effect
of chains of dependence through locally scoped symbols in the LinkGraph. E.g.

        .text
        .globl  foo
foo:
        callq   bar                    ## foo depneds on external 'bar'
        movq    Ltmp1(%rip), %rcx      ## foo depends on locally scoped 'Ltmp1'
        addl    (%rcx), %eax
        retq

        .data
Ltmp1:
        .quad   x                      ## Ltmp1 depends on external 'x'

In this example symbol 'foo' depends directly on 'bar', and indirectly on 'x'
via 'Ltmp1', which is locally scoped.

Performance of the existing implementations appears to have been mediocre:
Based on flame graphs posted by @drmeister (in #jit on the LLVM discord server)
the computeLocalDeps function was taking up a substantial amount of time when
starting up Clasp (https://github.com/clasp-developers/clasp).

This commit attempts to address the performance problems in three ways:

1. Using jitlink::Blocks instead of jitlink::Symbols as the nodes of the
dependencies-introduced-by-locally-scoped-symbols graph.

Using either Blocks or Symbols as nodes provides the same information, but since
there may be more than one locally scoped symbol per block the block-based
version of the dependence graph should always be a subgraph of the Symbol-based
version, and so faster to operate on.

2. Improved worklist management.

The older version of computeLocalDeps used a fixed worklist containing all
nodes, and iterated over this list propagating dependencies until no further
changes were required. The worklist was not sorted into a useful order before
the loop started.

The new version uses a variable work-stack, visiting nodes in DFS order and
only adding nodes when there is meaningful work to do on them.

Compared to the old version the new version avoids revisiting nodes which
haven't changed, and I suspect it converges more quickly (due to the DFS
ordering).

3. Laziness and caching.

Mappings of...

jitlink::Symbol* -> Interned Name (as SymbolStringPtr)
jitlink::Block* -> Immediate dependencies (as SymbolNameSet)
jitlink::Block* -> Transitive dependencies (as SymbolNameSet)

are all built lazily and cached while running computeNamedSymbolDependencies.

According to @drmeister these changes reduced Clasp startup time in his test
setup (averaged over a handful of starts) from 4.8 to 2.8 seconds (with
ORC/JITLink linking ~11,000 object files in that time), which seems like
enough to justify switching to the new algorithm in the absence of any other
perf numbers.

Added: 
    

Modified: 
    llvm/include/llvm/ExecutionEngine/Orc/MachOPlatform.h
    llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h
    llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp
    llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/ExecutionEngine/Orc/MachOPlatform.h b/llvm/include/llvm/ExecutionEngine/Orc/MachOPlatform.h
index f9d0b587a1be..9eb2ce33cf81 100644
--- a/llvm/include/llvm/ExecutionEngine/Orc/MachOPlatform.h
+++ b/llvm/include/llvm/ExecutionEngine/Orc/MachOPlatform.h
@@ -111,8 +111,8 @@ class MachOPlatform : public Platform {
                           jitlink::LinkGraph &G,
                           jitlink::PassConfiguration &Config) override;
 
-    LocalDependenciesMap getSyntheticSymbolLocalDependencies(
-        MaterializationResponsibility &MR) override;
+    SyntheticSymbolDependenciesMap
+    getSyntheticSymbolDependencies(MaterializationResponsibility &MR) override;
 
     // FIXME: We should be tentatively tracking scraped sections and discarding
     // if the MR fails.
@@ -129,9 +129,9 @@ class MachOPlatform : public Platform {
 
   private:
     using InitSymbolDepMap =
-        DenseMap<MaterializationResponsibility *, JITLinkSymbolVector>;
+        DenseMap<MaterializationResponsibility *, JITLinkSymbolSet>;
 
-    void preserveInitSectionIfPresent(JITLinkSymbolVector &Syms,
+    void preserveInitSectionIfPresent(JITLinkSymbolSet &Symbols,
                                       jitlink::LinkGraph &G,
                                       StringRef SectionName);
 

diff  --git a/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h b/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h
index 55d0634a82ae..3bb83342dcdb 100644
--- a/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h
+++ b/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h
@@ -64,8 +64,9 @@ class ObjectLinkingLayer : public RTTIExtends<ObjectLinkingLayer, ObjectLayer>,
   /// configured.
   class Plugin {
   public:
-    using JITLinkSymbolVector = std::vector<const jitlink::Symbol *>;
-    using LocalDependenciesMap = DenseMap<SymbolStringPtr, JITLinkSymbolVector>;
+    using JITLinkSymbolSet = DenseSet<jitlink::Symbol *>;
+    using SyntheticSymbolDependenciesMap =
+        DenseMap<SymbolStringPtr, JITLinkSymbolSet>;
 
     virtual ~Plugin();
     virtual void modifyPassConfig(MaterializationResponsibility &MR,
@@ -89,12 +90,12 @@ class ObjectLinkingLayer : public RTTIExtends<ObjectLinkingLayer, ObjectLayer>,
                                              ResourceKey SrcKey) = 0;
 
     /// Return any dependencies that synthetic symbols (e.g. init symbols)
-    /// have on locally scoped jitlink::Symbols. This is used by the
-    /// ObjectLinkingLayer to update the dependencies for the synthetic
-    /// symbols.
-    virtual LocalDependenciesMap
-    getSyntheticSymbolLocalDependencies(MaterializationResponsibility &MR) {
-      return LocalDependenciesMap();
+    /// have on symbols in the LinkGraph.
+    /// This is used by the ObjectLinkingLayer to update the dependencies for
+    /// the synthetic symbols.
+    virtual SyntheticSymbolDependenciesMap
+    getSyntheticSymbolDependencies(MaterializationResponsibility &MR) {
+      return SyntheticSymbolDependenciesMap();
     }
   };
 

diff  --git a/llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp b/llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp
index 74c88b0c1c85..39557a485cf2 100644
--- a/llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp
+++ b/llvm/lib/ExecutionEngine/Orc/MachOPlatform.cpp
@@ -314,17 +314,14 @@ void MachOPlatform::InitScraperPlugin::modifyPassConfig(
     return;
 
   Config.PrePrunePasses.push_back([this, &MR](jitlink::LinkGraph &G) -> Error {
-    JITLinkSymbolVector InitSectionSymbols;
-    preserveInitSectionIfPresent(InitSectionSymbols, G,
-                                 "__DATA,__mod_init_func");
-    preserveInitSectionIfPresent(InitSectionSymbols, G,
-                                 "__DATA,__objc_selrefs");
-    preserveInitSectionIfPresent(InitSectionSymbols, G,
-                                 "__DATA,__objc_classlist");
-
-    if (!InitSectionSymbols.empty()) {
+    JITLinkSymbolSet InitSectionSyms;
+    preserveInitSectionIfPresent(InitSectionSyms, G, "__DATA,__mod_init_func");
+    preserveInitSectionIfPresent(InitSectionSyms, G, "__DATA,__objc_selrefs");
+    preserveInitSectionIfPresent(InitSectionSyms, G, "__DATA,__objc_classlist");
+
+    if (!InitSectionSyms.empty()) {
       std::lock_guard<std::mutex> Lock(InitScraperMutex);
-      InitSymbolDeps[&MR] = std::move(InitSectionSymbols);
+      InitSymbolDeps[&MR] = std::move(InitSectionSyms);
     }
 
     if (auto Err = processObjCImageInfo(G, MR))
@@ -398,27 +395,26 @@ void MachOPlatform::InitScraperPlugin::modifyPassConfig(
   });
 }
 
-ObjectLinkingLayer::Plugin::LocalDependenciesMap
-MachOPlatform::InitScraperPlugin::getSyntheticSymbolLocalDependencies(
+ObjectLinkingLayer::Plugin::SyntheticSymbolDependenciesMap
+MachOPlatform::InitScraperPlugin::getSyntheticSymbolDependencies(
     MaterializationResponsibility &MR) {
   std::lock_guard<std::mutex> Lock(InitScraperMutex);
   auto I = InitSymbolDeps.find(&MR);
   if (I != InitSymbolDeps.end()) {
-    LocalDependenciesMap Result;
+    SyntheticSymbolDependenciesMap Result;
     Result[MR.getInitializerSymbol()] = std::move(I->second);
     InitSymbolDeps.erase(&MR);
     return Result;
   }
-  return LocalDependenciesMap();
+  return SyntheticSymbolDependenciesMap();
 }
 
 void MachOPlatform::InitScraperPlugin::preserveInitSectionIfPresent(
-    JITLinkSymbolVector &Symbols, jitlink::LinkGraph &G,
-    StringRef SectionName) {
+    JITLinkSymbolSet &Symbols, jitlink::LinkGraph &G, StringRef SectionName) {
   if (auto *Sec = G.findSectionByName(SectionName)) {
     auto SecBlocks = Sec->blocks();
     if (!llvm::empty(SecBlocks))
-      Symbols.push_back(
+      Symbols.insert(
           &G.addAnonymousSymbol(**SecBlocks.begin(), 0, 0, false, true));
   }
 }

diff  --git a/llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp b/llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp
index c10aa15ef269..a45b18544609 100644
--- a/llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp
+++ b/llvm/lib/ExecutionEngine/Orc/ObjectLinkingLayer.cpp
@@ -331,12 +331,82 @@ class ObjectLinkingLayerJITLinkContext final : public JITLinkContext {
   }
 
 private:
-  struct LocalSymbolNamedDependencies {
+  // Symbol name dependencies:
+  // Internal: Defined in this graph.
+  // External: Defined externally.
+  struct BlockSymbolDependencies {
     SymbolNameSet Internal, External;
   };
 
-  using LocalSymbolNamedDependenciesMap =
-      DenseMap<const Symbol *, LocalSymbolNamedDependencies>;
+  // Lazily populated map of blocks to BlockSymbolDependencies values.
+  class BlockDependenciesMap {
+  public:
+    BlockDependenciesMap(ExecutionSession &ES,
+                         DenseMap<const Block *, DenseSet<Block *>> BlockDeps)
+        : ES(ES), BlockDeps(std::move(BlockDeps)) {}
+
+    const BlockSymbolDependencies &operator[](const Block &B) {
+      // Check the cache first.
+      auto I = BlockTransitiveDepsCache.find(&B);
+      if (I != BlockTransitiveDepsCache.end())
+        return I->second;
+
+      // No value. Populate the cache.
+      BlockSymbolDependencies BTDCacheVal;
+      auto BDI = BlockDeps.find(&B);
+      assert(BDI != BlockDeps.end() && "No block dependencies");
+
+      for (auto *BDep : BDI->second) {
+        auto &BID = getBlockImmediateDeps(*BDep);
+        for (auto &ExternalDep : BID.External)
+          BTDCacheVal.External.insert(ExternalDep);
+        for (auto &InternalDep : BID.Internal)
+          BTDCacheVal.Internal.insert(InternalDep);
+      }
+
+      return BlockTransitiveDepsCache
+          .insert(std::make_pair(&B, std::move(BTDCacheVal)))
+          .first->second;
+    }
+
+    SymbolStringPtr &getInternedName(Symbol &Sym) {
+      auto I = NameCache.find(&Sym);
+      if (I != NameCache.end())
+        return I->second;
+
+      return NameCache.insert(std::make_pair(&Sym, ES.intern(Sym.getName())))
+          .first->second;
+    }
+
+  private:
+    BlockSymbolDependencies &getBlockImmediateDeps(Block &B) {
+      // Check the cache first.
+      auto I = BlockImmediateDepsCache.find(&B);
+      if (I != BlockImmediateDepsCache.end())
+        return I->second;
+
+      BlockSymbolDependencies BIDCacheVal;
+      for (auto &E : B.edges()) {
+        auto &Tgt = E.getTarget();
+        if (Tgt.getScope() != Scope::Local) {
+          if (Tgt.isExternal())
+            BIDCacheVal.External.insert(getInternedName(Tgt));
+          else
+            BIDCacheVal.Internal.insert(getInternedName(Tgt));
+        }
+      }
+
+      return BlockImmediateDepsCache
+          .insert(std::make_pair(&B, std::move(BIDCacheVal)))
+          .first->second;
+    }
+
+    ExecutionSession &ES;
+    DenseMap<const Block *, DenseSet<Block *>> BlockDeps;
+    DenseMap<const Symbol *, SymbolStringPtr> NameCache;
+    DenseMap<const Block *, BlockSymbolDependencies> BlockImmediateDepsCache;
+    DenseMap<const Block *, BlockSymbolDependencies> BlockTransitiveDepsCache;
+  };
 
   Error claimOrExternalizeWeakAndCommonSymbols(LinkGraph &G) {
     auto &ES = Layer.getExecutionSession();
@@ -384,7 +454,7 @@ class ObjectLinkingLayerJITLinkContext final : public JITLinkContext {
 
   Error computeNamedSymbolDependencies(LinkGraph &G) {
     auto &ES = MR->getTargetJITDylib().getExecutionSession();
-    auto LocalDeps = computeLocalDeps(G);
+    auto BlockDeps = computeBlockNonLocalDeps(G);
 
     // Compute dependencies for symbols defined in the JITLink graph.
     for (auto *Sym : G.defined_symbols()) {
@@ -395,58 +465,41 @@ class ObjectLinkingLayerJITLinkContext final : public JITLinkContext {
       assert(Sym->hasName() &&
              "Defined non-local jitlink::Symbol should have a name");
 
-      SymbolNameSet ExternalSymDeps, InternalSymDeps;
-
-      // Find internal and external named symbol dependencies.
-      for (auto &E : Sym->getBlock().edges()) {
-        auto &TargetSym = E.getTarget();
-
-        if (TargetSym.getScope() != Scope::Local) {
-          if (TargetSym.isExternal())
-            ExternalSymDeps.insert(ES.intern(TargetSym.getName()));
-          else if (&TargetSym != Sym)
-            InternalSymDeps.insert(ES.intern(TargetSym.getName()));
-        } else {
-          assert(TargetSym.isDefined() &&
-                 "local symbols must be defined");
-          auto I = LocalDeps.find(&TargetSym);
-          if (I != LocalDeps.end()) {
-            for (auto &S : I->second.External)
-              ExternalSymDeps.insert(S);
-            for (auto &S : I->second.Internal)
-              InternalSymDeps.insert(S);
-          }
-        }
-      }
-
-      if (ExternalSymDeps.empty() && InternalSymDeps.empty())
+      auto &SymDeps = BlockDeps[Sym->getBlock()];
+      if (SymDeps.External.empty() && SymDeps.Internal.empty())
         continue;
 
       auto SymName = ES.intern(Sym->getName());
-      if (!ExternalSymDeps.empty())
-        ExternalNamedSymbolDeps[SymName] = std::move(ExternalSymDeps);
-      if (!InternalSymDeps.empty())
-        InternalNamedSymbolDeps[SymName] = std::move(InternalSymDeps);
+      if (!SymDeps.External.empty())
+        ExternalNamedSymbolDeps[SymName] = SymDeps.External;
+      if (!SymDeps.Internal.empty())
+        InternalNamedSymbolDeps[SymName] = SymDeps.Internal;
     }
 
     for (auto &P : Layer.Plugins) {
-      auto SyntheticLocalDeps = P->getSyntheticSymbolLocalDependencies(*MR);
-      if (SyntheticLocalDeps.empty())
+      auto SynthDeps = P->getSyntheticSymbolDependencies(*MR);
+      if (SynthDeps.empty())
         continue;
 
-      for (auto &KV : SyntheticLocalDeps) {
+      DenseSet<Block *> BlockVisited;
+      for (auto &KV : SynthDeps) {
         auto &Name = KV.first;
-        auto &LocalDepsForName = KV.second;
-        for (auto *Local : LocalDepsForName) {
-          assert(Local->getScope() == Scope::Local &&
-                 "Dependence on non-local symbol");
-          auto LocalNamedDepsItr = LocalDeps.find(Local);
-          if (LocalNamedDepsItr == LocalDeps.end())
-            continue;
-          for (auto &S : LocalNamedDepsItr->second.Internal)
-            InternalNamedSymbolDeps[Name].insert(S);
-          for (auto &S : LocalNamedDepsItr->second.External)
-            ExternalNamedSymbolDeps[Name].insert(S);
+        auto &DepsForName = KV.second;
+        for (auto *Sym : DepsForName) {
+          if (Sym->getScope() == Scope::Local) {
+            auto &BDeps = BlockDeps[Sym->getBlock()];
+            for (auto &S : BDeps.Internal)
+              InternalNamedSymbolDeps[Name].insert(S);
+            for (auto &S : BDeps.External)
+              ExternalNamedSymbolDeps[Name].insert(S);
+          } else {
+            if (Sym->isExternal())
+              ExternalNamedSymbolDeps[Name].insert(
+                  BlockDeps.getInternedName(*Sym));
+            else
+              InternalNamedSymbolDeps[Name].insert(
+                  BlockDeps.getInternedName(*Sym));
+          }
         }
       }
     }
@@ -454,81 +507,69 @@ class ObjectLinkingLayerJITLinkContext final : public JITLinkContext {
     return Error::success();
   }
 
-  LocalSymbolNamedDependenciesMap computeLocalDeps(LinkGraph &G) {
-    DenseMap<jitlink::Symbol *, DenseSet<jitlink::Symbol *>> DepMap;
-
-    // For all local symbols:
-    // (1) Add their named dependencies.
-    // (2) Add them to the worklist for further iteration if they have any
-    //     depend on any other local symbols.
-    struct WorklistEntry {
-      WorklistEntry(Symbol *Sym, DenseSet<Symbol *> LocalDeps)
-          : Sym(Sym), LocalDeps(std::move(LocalDeps)) {}
-
-      Symbol *Sym = nullptr;
-      DenseSet<Symbol *> LocalDeps;
+  BlockDependenciesMap computeBlockNonLocalDeps(LinkGraph &G) {
+    // First calculate the reachable-via-non-local-symbol blocks for each block.
+    struct BlockInfo {
+      DenseSet<Block *> Dependencies;
+      DenseSet<Block *> Dependants;
+      bool DependenciesChanged = true;
     };
-    std::vector<WorklistEntry> Worklist;
-    for (auto *Sym : G.defined_symbols())
-      if (Sym->getScope() == Scope::Local) {
-        auto &SymNamedDeps = DepMap[Sym];
-        DenseSet<Symbol *> LocalDeps;
-
-        for (auto &E : Sym->getBlock().edges()) {
-          auto &TargetSym = E.getTarget();
-          if (TargetSym.getScope() != Scope::Local)
-            SymNamedDeps.insert(&TargetSym);
-          else {
-            assert(TargetSym.isDefined() &&
-                   "local symbols must be defined");
-            LocalDeps.insert(&TargetSym);
+    DenseMap<Block *, BlockInfo> BlockInfos;
+    SmallVector<Block *> WorkList;
+
+    // Pre-allocate map entries. This prevents any iterator/reference
+    // invalidation in the next loop.
+    for (auto *B : G.blocks())
+      (void)BlockInfos[B];
+
+    // Build initial worklist, record block dependencies/dependants and
+    // non-local symbol dependencies.
+    for (auto *B : G.blocks()) {
+      auto &BI = BlockInfos[B];
+      for (auto &E : B->edges()) {
+        if (E.getTarget().getScope() == Scope::Local) {
+          auto &TgtB = E.getTarget().getBlock();
+          if (&TgtB != B) {
+            BI.Dependencies.insert(&TgtB);
+            BlockInfos[&TgtB].Dependants.insert(B);
           }
         }
-
-        if (!LocalDeps.empty())
-          Worklist.push_back(WorklistEntry(Sym, std::move(LocalDeps)));
       }
 
-    // Loop over all local symbols with local dependencies, propagating
-    // their respective non-local dependencies. Iterate until we hit a stable
-    // state.
-    bool Changed;
-    do {
-      Changed = false;
-      for (auto &WLEntry : Worklist) {
-        auto *Sym = WLEntry.Sym;
-        auto &NamedDeps = DepMap[Sym];
-        auto &LocalDeps = WLEntry.LocalDeps;
-
-        for (auto *TargetSym : LocalDeps) {
-          auto I = DepMap.find(TargetSym);
-          if (I != DepMap.end())
-            for (const auto &S : I->second)
-              Changed |= NamedDeps.insert(S).second;
-        }
-      }
-    } while (Changed);
+      // If this node has both dependants and dependencies then add it to the
+      // worklist to propagate the dependencies to the dependants.
+      if (!BI.Dependants.empty() && !BI.Dependencies.empty())
+        WorkList.push_back(B);
+    }
 
-    // Intern the results to produce a mapping of jitlink::Symbol* to internal
-    // and external symbol names.
-    auto &ES = Layer.getExecutionSession();
-    LocalSymbolNamedDependenciesMap Result;
-    for (auto &KV : DepMap) {
-      auto *Local = KV.first;
-      assert(Local->getScope() == Scope::Local &&
-             "DepMap keys should all be local symbols");
-      auto &LocalNamedDeps = Result[Local];
-      for (auto *Named : KV.second) {
-        assert(Named->getScope() != Scope::Local &&
-               "DepMap values should all be non-local symbol sets");
-        if (Named->isExternal())
-          LocalNamedDeps.External.insert(ES.intern(Named->getName()));
-        else
-          LocalNamedDeps.Internal.insert(ES.intern(Named->getName()));
+    // Propagate block-level dependencies through the block-dependence graph.
+    while (!WorkList.empty()) {
+      auto *B = WorkList.back();
+      WorkList.pop_back();
+
+      auto &BI = BlockInfos[B];
+      assert(BI.DependenciesChanged &&
+             "Block in worklist has unchanged dependencies");
+      BI.DependenciesChanged = false;
+      for (auto *Dependant : BI.Dependants) {
+        auto &DependantBI = BlockInfos[Dependant];
+        for (auto *Dependency : BI.Dependencies) {
+          if (Dependant != Dependency &&
+              DependantBI.Dependencies.insert(Dependency).second)
+            if (!DependantBI.DependenciesChanged) {
+              DependantBI.DependenciesChanged = true;
+              WorkList.push_back(Dependant);
+            }
+        }
       }
     }
 
-    return Result;
+    DenseMap<const Block *, DenseSet<Block *>> BlockDeps;
+    for (auto &KV : BlockInfos)
+      BlockDeps[KV.first] = std::move(KV.second.Dependencies);
+
+    return BlockDependenciesMap(Layer.getExecutionSession(),
+                                std::move(BlockDeps));
   }
 
   void registerDependencies(const SymbolDependenceMap &QueryDeps) {


        


More information about the llvm-commits mailing list