[llvm] r255100 - The current importing scheme is processing one function at a time,

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 9 12:45:44 PST 2015


Thanks for the fix.

Interestingly it didn’t break ASAN on OS X.

— 
Mehdi

> On Dec 9, 2015, at 12:44 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
> Should be fixed with r255149. We hit that with a windows debug build too.
> 
> Cheers,
> Rafael
> 
> 
> On 9 December 2015 at 15:10, Alexey Samsonov via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
>> 
>> On Wed, Dec 9, 2015 at 12:17 AM, Mehdi Amini via llvm-commits
>> <llvm-commits at lists.llvm.org> wrote:
>>> 
>>> Author: mehdi_amini
>>> Date: Wed Dec  9 02:17:35 2015
>>> New Revision: 255100
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=255100&view=rev
>>> Log:
>>> The current importing scheme is processing one function at a time,
>>> loading the source Module, linking the function in the destination
>>> module, and destroying the source Module before repeating with the
>>> next function to import (potentially from the same Module).
>>> 
>>> Ideally we would keep the source Module alive and import the next
>>> Function needed from this Module. Unfortunately this is not possible
>>> because the linker does not leave it in a usable state.
>>> 
>>> However we can do better by first computing the list of all candidates
>>> per Module, and only then load the source Module and import all the
>>> function we need for it.
>>> 
>>> The trick to process callees is to materialize function in the source
>>> module when building the list of function to import, and inspect them
>>> in their source module, collecting the list of callees for each
>>> callee.
>>> 
>>> When we move the the actual import, we will import from each source
>>> module exactly once. Each source module is loaded exactly once.
>>> The only drawback it that it requires to have all the lazy-loaded
>>> source Module in memory at the same time.
>>> 
>>> Currently this patch already improves considerably the link time,
>>> a multithreaded link of llvm-dis on my laptop was:
>>> 
>>>  real  1m12.175s  user  6m32.430s sys  0m10.529s
>>> 
>>> and is now:
>>> 
>>>  real  0m40.697s  user  2m10.237s sys  0m4.375s
>>> 
>>> Note: this is the full link time (linker+Import+Optimizer+CodeGen)
>>> 
>>> Differential Revision: http://reviews.llvm.org/D15178
>>> 
>>> From: Mehdi Amini <mehdi.amini at apple.com>
>>> 
>>> Modified:
>>>    llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp
>>> 
>>> Modified: llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp?rev=255100&r1=255099&r2=255100&view=diff
>>> 
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp Wed Dec  9 02:17:35
>>> 2015
>>> @@ -24,6 +24,9 @@
>>> #include "llvm/Support/CommandLine.h"
>>> #include "llvm/Support/Debug.h"
>>> #include "llvm/Support/SourceMgr.h"
>>> +
>>> +#include <map>
>>> +
>>> using namespace llvm;
>>> 
>>> #define DEBUG_TYPE "function-import"
>>> @@ -50,37 +53,108 @@ static std::unique_ptr<Module> loadFile(
>>>   return Result;
>>> }
>>> 
>>> +namespace {
>>> +/// Helper to load on demand a Module from file and cache it for
>>> subsequent
>>> +/// queries. It can be used with the FunctionImporter.
>>> +class ModuleLazyLoaderCache {
>>> +  /// Cache of lazily loaded module for import.
>>> +  StringMap<std::unique_ptr<Module>> ModuleMap;
>>> +
>>> +  /// Retrieve a Module from the cache or lazily load it on demand.
>>> +  std::function<std::unique_ptr<Module>(StringRef FileName)>
>>> createLazyModule;
>>> +
>>> +public:
>>> +  /// Create the loader, Module will be initialized in \p Context.
>>> +  ModuleLazyLoaderCache(std::function<
>>> +      std::unique_ptr<Module>(StringRef FileName)> createLazyModule)
>>> +      : createLazyModule(createLazyModule) {}
>>> +
>>> +  /// Retrieve a Module from the cache or lazily load it on demand.
>>> +  Module &operator()(StringRef FileName);
>>> +};
>>> +
>>> +// Get a Module for \p FileName from the cache, or load it lazily.
>>> +Module &ModuleLazyLoaderCache::operator()(StringRef Identifier) {
>>> +  auto &Module = ModuleMap[Identifier];
>>> +  if (!Module)
>>> +    Module = createLazyModule(Identifier);
>>> +  return *Module;
>>> +}
>>> +} // anonymous namespace
>>> +
>>> /// Walk through the instructions in \p F looking for external
>>> /// calls not already in the \p CalledFunctions set. If any are
>>> /// found they are added to the \p Worklist for importing.
>>> -static void findExternalCalls(const Function &F, StringSet<>
>>> &CalledFunctions,
>>> +static void findExternalCalls(const Module &DestModule, Function &F,
>>> +                              const FunctionInfoIndex &Index,
>>> +                              StringSet<> &CalledFunctions,
>>>                               SmallVector<StringRef, 64> &Worklist) {
>>> +  // We need to suffix internal function calls imported from other
>>> modules,
>>> +  // prepare the suffix ahead of time.
>>> +  StringRef Suffix;
>>> +  if (F.getParent() != &DestModule)
>>> +    Suffix =
>>> +        (Twine(".llvm.") +
>>> +
>>> Twine(Index.getModuleId(F.getParent()->getModuleIdentifier()))).str();
>> 
>> 
>> ^^
>> No. Please don't assign temporary string to StringRef. This was detected by
>> one of our ASan bootstrap bots.
>> 
>>> 
>>> +
>>>   for (auto &BB : F) {
>>>     for (auto &I : BB) {
>>>       if (isa<CallInst>(I)) {
>>>         auto CalledFunction = cast<CallInst>(I).getCalledFunction();
>>>         // Insert any new external calls that have not already been
>>>         // added to set/worklist.
>>> -        if (CalledFunction && CalledFunction->hasName() &&
>>> -            CalledFunction->isDeclaration() &&
>>> -            !CalledFunctions.count(CalledFunction->getName())) {
>>> -          CalledFunctions.insert(CalledFunction->getName());
>>> -          Worklist.push_back(CalledFunction->getName());
>>> +        if (!CalledFunction || !CalledFunction->hasName())
>>> +          continue;
>>> +        // Ignore intrinsics early
>>> +        if (CalledFunction->isIntrinsic()) {
>>> +          assert(CalledFunction->getIntrinsicID() != 0);
>>> +          continue;
>>> +        }
>>> +        auto ImportedName = CalledFunction->getName();
>>> +        auto Renamed = (ImportedName + Suffix).str();
>>> +        // Rename internal functions
>>> +        if (CalledFunction->hasInternalLinkage()) {
>>> +          ImportedName = Renamed;
>>> +        }
>>> +        auto It = CalledFunctions.insert(ImportedName);
>>> +        if (!It.second) {
>>> +          // This is a call to a function we already considered, skip.
>>> +          continue;
>>>         }
>>> +        // Ignore functions already present in the destination module
>>> +        auto *SrcGV = DestModule.getNamedValue(ImportedName);
>>> +        if (SrcGV) {
>>> +          assert(isa<Function>(SrcGV) && "Name collision during import");
>>> +          if (!cast<Function>(SrcGV)->isDeclaration()) {
>>> +            DEBUG(dbgs() << DestModule.getModuleIdentifier() << "Ignoring
>>> "
>>> +                         << ImportedName << " already in
>>> DestinationModule\n");
>>> +            continue;
>>> +          }
>>> +        }
>>> +
>>> +        Worklist.push_back(It.first->getKey());
>>> +        DEBUG(dbgs() << DestModule.getModuleIdentifier()
>>> +                     << " Adding callee for : " << ImportedName << " : "
>>> +                     << F.getName() << "\n");
>>>       }
>>>     }
>>>   }
>>> }
>>> 
>>> // Helper function: given a worklist and an index, will process all the
>>> worklist
>>> -// and import them based on the summary information
>>> -static unsigned ProcessImportWorklist(
>>> +// and decide what to import based on the summary information.
>>> +//
>>> +// Nothing is actually imported, functions are materialized in their
>>> source
>>> +// module and analyzed there.
>>> +//
>>> +// \p ModuleToFunctionsToImportMap is filled with the set of Function to
>>> import
>>> +// per Module.
>>> +static void GetImportList(
>>>     Module &DestModule, SmallVector<StringRef, 64> &Worklist,
>>> -    StringSet<> &CalledFunctions, Linker &TheLinker,
>>> -    const FunctionInfoIndex &Index,
>>> -    std::function<std::unique_ptr<Module>(StringRef FileName)> &
>>> -        LazyModuleLoader) {
>>> -  unsigned ImportCount = 0;
>>> +    StringSet<> &CalledFunctions,
>>> +    std::map<StringRef, std::pair<Module *, DenseSet<const GlobalValue
>>> *>>> &
>>> +        ModuleToFunctionsToImportMap,
>>> +    const FunctionInfoIndex &Index, ModuleLazyLoaderCache
>>> &ModuleLoaderCache) {
>>>   while (!Worklist.empty()) {
>>>     auto CalledFunctionName = Worklist.pop_back_val();
>>>     DEBUG(dbgs() << DestModule.getModuleIdentifier() << "Process import
>>> for "
>>> @@ -116,40 +190,39 @@ static unsigned ProcessImportWorklist(
>>>     }
>>> 
>>>     // Get the module path from the summary.
>>> -    auto FileName = Summary->modulePath();
>>> -    DEBUG(dbgs() << "Importing " << CalledFunctionName << " from " <<
>>> FileName
>>> -                 << "\n");
>>> +    auto ModuleIdentifier = Summary->modulePath();
>>> +    DEBUG(dbgs() << DestModule.getModuleIdentifier() << " Importing "
>>> +                 << CalledFunctionName << " from " << ModuleIdentifier <<
>>> "\n");
>>> 
>>> -    // Get the module for the import
>>> -    auto SrcModule = LazyModuleLoader(FileName);
>>> -    assert(&SrcModule->getContext() == &DestModule.getContext());
>>> +    auto &SrcModule = ModuleLoaderCache(ModuleIdentifier);
>>> 
>>>     // The function that we will import!
>>> -    GlobalValue *SGV = SrcModule->getNamedValue(CalledFunctionName);
>>> -    StringRef ImportFunctionName = CalledFunctionName;
>>> +    GlobalValue *SGV = SrcModule.getNamedValue(CalledFunctionName);
>>> +
>>>     if (!SGV) {
>>> -      // Might be local in source Module, promoted/renamed in DestModule.
>>> +      // The destination module is referencing function using their
>>> renamed name
>>> +      // when importing a function that was originally local in the
>>> source
>>> +      // module. The source module we have might not have been renamed so
>>> we try
>>> +      // to remove the suffix added during the renaming to recover the
>>> original
>>> +      // name in the source module.
>>>       std::pair<StringRef, StringRef> Split =
>>>           CalledFunctionName.split(".llvm.");
>>> -      SGV = SrcModule->getNamedValue(Split.first);
>>> -#ifndef NDEBUG
>>> -      // Assert that Split.second is module id
>>> -      uint64_t ModuleId;
>>> -      assert(!Split.second.getAsInteger(10, ModuleId));
>>> -      assert(ModuleId == Index.getModuleId(FileName));
>>> -#endif
>>> +      SGV = SrcModule.getNamedValue(Split.first);
>>> +      assert(SGV && "Can't find function to import in source module");
>>> +    }
>>> +    if (!SGV) {
>>> +      report_fatal_error(Twine("Can't load function '") +
>>> CalledFunctionName +
>>> +                         "' in Module '" +
>>> SrcModule.getModuleIdentifier() +
>>> +                         "', error in the summary?\n");
>>>     }
>>> +
>>>     Function *F = dyn_cast<Function>(SGV);
>>>     if (!F && isa<GlobalAlias>(SGV)) {
>>>       auto *SGA = dyn_cast<GlobalAlias>(SGV);
>>>       F = dyn_cast<Function>(SGA->getBaseObject());
>>> -      ImportFunctionName = F->getName();
>>> -    }
>>> -    if (!F) {
>>> -      errs() << "Can't load function '" << CalledFunctionName << "' in
>>> Module '"
>>> -             << FileName << "', error in the summary?\n";
>>> -      llvm_unreachable("Can't load function in Module");
>>> +      CalledFunctionName = F->getName();
>>>     }
>>> +    assert(F && "Imported Function is ... not a Function");
>>> 
>>>     // We cannot import weak_any functions/aliases without possibly
>>> affecting
>>>     // the order they are seen and selected by the linker, changing
>>> program
>>> @@ -158,26 +231,20 @@ static unsigned ProcessImportWorklist(
>>>       DEBUG(dbgs() << DestModule.getModuleIdentifier()
>>>                    << " Ignoring import request for weak-any "
>>>                    << (isa<Function>(SGV) ? "function " : "alias ")
>>> -                   << CalledFunctionName << " from " << FileName <<
>>> "\n");
>>> +                   << CalledFunctionName << " from "
>>> +                   << SrcModule.getModuleIdentifier() << "\n");
>>>       continue;
>>>     }
>>> 
>>> -    // Link in the specified function.
>>> -    DenseSet<const GlobalValue *> FunctionsToImport;
>>> -    FunctionsToImport.insert(F);
>>> -    if (TheLinker.linkInModule(*SrcModule, Linker::Flags::None, &Index,
>>> -                               &FunctionsToImport))
>>> -      report_fatal_error("Function Import: link error");
>>> -
>>> -    // Process the newly imported function and add callees to the
>>> worklist.
>>> -    GlobalValue *NewGV = DestModule.getNamedValue(ImportFunctionName);
>>> -    assert(NewGV);
>>> -    Function *NewF = dyn_cast<Function>(NewGV);
>>> -    assert(NewF);
>>> -    findExternalCalls(*NewF, CalledFunctions, Worklist);
>>> -    ++ImportCount;
>>> +    // Add the function to the import list
>>> +    auto &Entry =
>>> ModuleToFunctionsToImportMap[SrcModule.getModuleIdentifier()];
>>> +    Entry.first = &SrcModule;
>>> +    Entry.second.insert(F);
>>> +
>>> +    // Process the newly imported functions and add callees to the
>>> worklist.
>>> +    F->materialize();
>>> +    findExternalCalls(DestModule, *F, Index, CalledFunctions, Worklist);
>>>   }
>>> -  return ImportCount;
>>> }
>>> 
>>> // Automatically import functions in Module \p DestModule based on the
>>> summaries
>>> @@ -196,7 +263,7 @@ bool FunctionImporter::importFunctions(M
>>>   for (auto &F : DestModule) {
>>>     if (F.isDeclaration() || F.hasFnAttribute(Attribute::OptimizeNone))
>>>       continue;
>>> -    findExternalCalls(F, CalledFunctions, Worklist);
>>> +    findExternalCalls(DestModule, F, Index, CalledFunctions, Worklist);
>>>   }
>>>   if (Worklist.empty())
>>>     return false;
>>> @@ -206,10 +273,33 @@ bool FunctionImporter::importFunctions(M
>>>   // Linker that will be used for importing function
>>>   Linker TheLinker(DestModule, DiagnosticHandler);
>>> 
>>> -  ImportedCount += ProcessImportWorklist(DestModule, Worklist,
>>> CalledFunctions,
>>> -                                         TheLinker, Index, ModuleLoader);
>>> +  // Map of Module -> List of Function to import from the Module
>>> +  std::map<StringRef, std::pair<Module *, DenseSet<const GlobalValue *>>>
>>> +      ModuleToFunctionsToImportMap;
>>> +
>>> +  // Analyze the summaries and get the list of functions to import by
>>> +  // populating ModuleToFunctionsToImportMap
>>> +  ModuleLazyLoaderCache ModuleLoaderCache(ModuleLoader);
>>> +  GetImportList(DestModule, Worklist, CalledFunctions,
>>> +                ModuleToFunctionsToImportMap, Index, ModuleLoaderCache);
>>> +  assert(Worklist.empty() && "Worklist hasn't been flushed in
>>> GetImportList");
>>> 
>>> -  DEBUG(errs() << "Imported " << ImportedCount << " functions for Module
>>> "
>>> +  // Do the actual import of functions now, one Module at a time
>>> +  for (auto &FunctionsToImportPerModule : ModuleToFunctionsToImportMap) {
>>> +    // Get the module for the import
>>> +    auto &FunctionsToImport = FunctionsToImportPerModule.second.second;
>>> +    auto *SrcModule = FunctionsToImportPerModule.second.first;
>>> +    assert(&DestModule.getContext() == &SrcModule->getContext() &&
>>> +           "Context mismatch");
>>> +
>>> +    // Link in the specified functions.
>>> +    if (TheLinker.linkInModule(*SrcModule, Linker::Flags::None, &Index,
>>> +                               &FunctionsToImport))
>>> +      report_fatal_error("Function Import: link error");
>>> +
>>> +    ImportedCount += FunctionsToImport.size();
>>> +  }
>>> +  DEBUG(dbgs() << "Imported " << ImportedCount << " functions for Module
>>> "
>>>                << DestModule.getModuleIdentifier() << "\n");
>>>   return ImportedCount;
>>> }
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> 
>> 
>> 
>> 
>> --
>> Alexey Samsonov
>> vonosmas at gmail.com
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> 



More information about the llvm-commits mailing list