[llvm] r264902 - [PGO] PGOFuncName in LTO optimizations

Rong Xu via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 30 13:26:46 PDT 2016


On Wed, Mar 30, 2016 at 1:01 PM, Justin Bogner <mail at justinbogner.com>
wrote:

> Rong Xu via llvm-commits <llvm-commits at lists.llvm.org> writes:
> > Author: xur
> > Date: Wed Mar 30 13:37:52 2016
> > New Revision: 264902
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=264902&view=rev
> > Log:
> > [PGO] PGOFuncName in LTO optimizations
> >
> > PGOFuncNames are used as the key to retrieve the Function definition
> from the
> > MD5 stored in the profile. For internal linkage function, we prefix the
> source
> > file name to the PGOFuncNames. LTO's internalization privatizes many
> > global linkage
> > symbols. This happens after value profile annotation, but those internal
> > linkage functions should not have a source prefix. To differentiate
> compiler
> > generated internal symbols from original ones, PGOFuncName meta data are
> > created and attached to the original internal symbols in the value
> profile
> > annotation step. If a symbol does not have the meta data, its original
> linkage
> > must be non-internal.
> >
> > Also add a new map that maps PGOFuncName's MD5 value to the function
> definition.
> >
> > Differential Revision: http://reviews.llvm.org/D17895
>
> This review didn't show up on the LLVM list at all. Please make sure
> that code is reviewed on list in the future.
>

Sorry. This is my fault. I thought I had add llvm-commits to the
subscriber, but obviously I did not.


> This might be okay for now, but the approach seems questionable. It's
> pretty awkward to have to teach profiling about LTO, and I feel like the
> problems with how we munge internal names come up in other contexts
> too.


> Longer term, I think that the right thing to do in terms of dealing with
> internal functions is to avoid munging their names at all. To do this,
> we'd need to add something to the function hash or as a new field to
> avoid collisions between unrelated local names, but such an approach
> would clean up the whole class of issues with how we rename these
> symbols.
>

I agree there are better ways to handle the conflict internal names than
current source prefixing method.
But I think it's gonna hard to avoid the avoid the interaction with LTO (I
mean the internalization). Note that the profile-generation pass is pre
LTO. The compiler has to have different handling of originally internal
symbols and later generated internal symbols.

>
> > Modified:
> >     llvm/trunk/include/llvm/ProfileData/InstrProf.h
> >     llvm/trunk/lib/ProfileData/InstrProf.cpp
> >     llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
> >
> > Modified: llvm/trunk/include/llvm/ProfileData/InstrProf.h
> > URL:
> >
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/InstrProf.h?rev=264902&r1=264901&r2=264902&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/include/llvm/ProfileData/InstrProf.h (original)
> > +++ llvm/trunk/include/llvm/ProfileData/InstrProf.h Wed Mar 30 13:37:52
> 2016
> > @@ -155,8 +155,9 @@ inline StringRef getInstrProfFileOverrid
> >  inline StringRef getInstrProfNameSeparator() { return "\01"; }
> >
> >  /// Return the modified name for function \c F suitable to be
> > -/// used the key for profile lookup.
> > -std::string getPGOFuncName(const Function &F,
> > +/// used the key for profile lookup. Variable \c InLTO indicates if this
> > +/// is called in LTO optimization passes.
> > +std::string getPGOFuncName(const Function &F, bool InLTO = false,
> >                             uint64_t Version = INSTR_PROF_INDEX_VERSION);
> >
> >  /// Return the modified name for a function suitable to be
> > @@ -302,13 +303,17 @@ private:
> >    StringSet<> NameTab;
> >    // A map from MD5 keys to function name strings.
> >    std::vector<std::pair<uint64_t, StringRef>> MD5NameMap;
> > +  // A map from MD5 keys to function define. We only populate this map
> > +  // when build the Symtab from a Module.
> > +  std::vector<std::pair<uint64_t, Function *>> MD5FuncMap;
> >    // A map from function runtime address to function name MD5 hash.
> >    // This map is only populated and used by raw instr profile reader.
> >    AddrHashMap AddrToMD5Map;
> >
> >  public:
> >    InstrProfSymtab()
> > -      : Data(), Address(0), NameTab(), MD5NameMap(), AddrToMD5Map() {}
> > +      : Data(), Address(0), NameTab(), MD5NameMap(), MD5FuncMap(),
> > +      AddrToMD5Map() {}
> >
> >    /// Create InstrProfSymtab from an object file section which
> >    /// contains function PGO names. When section may contain raw
> > @@ -326,8 +331,9 @@ public:
> >    inline std::error_code create(StringRef NameStrings);
> >    /// A wrapper interface to populate the PGO symtab with functions
> >    /// decls from module \c M. This interface is used by transformation
> > -  /// passes such as indirect function call promotion.
> > -  void create(const Module &M);
> > +  /// passes such as indirect function call promotion. Variable \c InLTO
> > +  /// indicates if this is called from LTO optimization passes.
> > +  void create(Module &M, bool InLTO = false);
> >    /// Create InstrProfSymtab from a set of names iteratable from
> >    /// \p IterRange. This interface is used by IndexedProfReader.
> >    template <typename NameIterRange> void create(const NameIterRange
> &IterRange);
> > @@ -357,6 +363,8 @@ public:
> >    /// Return function's PGO name from the name's md5 hash value.
> >    /// If not found, return an empty string.
> >    inline StringRef getFuncName(uint64_t FuncMD5Hash);
> > +  /// Return function from the name's md5 hash. Return nullptr if not
> found.
> > +  inline Function *getFunction(uint64_t FuncMD5Hash);
> >    /// Return the function's original assembly name by stripping off
> >    /// the prefix attached (to symbols with priviate linkage). For
> >    /// global functions, it returns the same string as getFuncName.
> > @@ -387,6 +395,7 @@ void InstrProfSymtab::create(const NameI
> >
> >  void InstrProfSymtab::finalizeSymtab() {
> >    std::sort(MD5NameMap.begin(), MD5NameMap.end(), less_first());
> > +  std::sort(MD5FuncMap.begin(), MD5FuncMap.end(), less_first());
> >    std::sort(AddrToMD5Map.begin(), AddrToMD5Map.end(), less_first());
> >    AddrToMD5Map.erase(std::unique(AddrToMD5Map.begin(),
> AddrToMD5Map.end()),
> >                       AddrToMD5Map.end());
> > @@ -402,6 +411,16 @@ StringRef InstrProfSymtab::getFuncName(u
> >    return StringRef();
> >  }
> >
> > +Function* InstrProfSymtab::getFunction(uint64_t FuncMD5Hash) {
> > +  auto Result =
> > +      std::lower_bound(MD5FuncMap.begin(), MD5FuncMap.end(),
> FuncMD5Hash,
> > +                       [](const std::pair<uint64_t, Function*> &LHS,
> > +                          uint64_t RHS) { return LHS.first < RHS; });
> > +  if (Result != MD5FuncMap.end() && Result->first == FuncMD5Hash)
> > +    return Result->second;
> > +  return nullptr;
> > +}
> > +
> >  // See also getPGOFuncName implementation. These two need to be
> >  // matched.
> >  StringRef InstrProfSymtab::getOrigFuncName(uint64_t FuncMD5Hash) {
> >
> > Modified: llvm/trunk/lib/ProfileData/InstrProf.cpp
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/InstrProf.cpp?rev=264902&r1=264901&r2=264902&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/lib/ProfileData/InstrProf.cpp (original)
> > +++ llvm/trunk/lib/ProfileData/InstrProf.cpp Wed Mar 30 13:37:52 2016
> > @@ -83,9 +83,32 @@ std::string getPGOFuncName(StringRef Raw
> >    return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage,
> FileName);
> >  }
> >
> > -std::string getPGOFuncName(const Function &F, uint64_t Version) {
> > -  return getPGOFuncName(F.getName(), F.getLinkage(),
> F.getParent()->getName(),
> > -                        Version);
> > +// Return the PGOFuncName. This function has some special handling when
> called
> > +// in LTO optimization. The following only applies when calling in LTO
> passes
> > +// (when \c InLTO is true): LTO's internalization privatizes many
> global linkage
> > +// symbols. This happens after value profile annotation, but those
> internal
> > +// linkage functions should not have a source prefix.
> > +// To differentiate compiler generated internal symbols from original
> ones,
> > +// PGOFuncName meta data are created and attached to the original
> internal
> > +// symbols in the value profile annotation step
> > +// (PGOUseFunc::annotateIndirectCallSites). If a symbol does not have
> the meta
> > +// data, its original linkage must be non-internal.
> > +std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t
> Version) {
> > +  if (!InLTO)
> > +    return getPGOFuncName(F.getName(), F.getLinkage(),
> F.getParent()->getName(),
> > +                          Version);
> > +
> > +  // InLTO mode. First check if these is a meta data.
> > +  MDNode *MD = F.getMetadata("PGOFuncName");
> > +  if (MD != nullptr) {
> > +    StringRef S = cast<MDString>(MD->getOperand(0))->getString();
> > +    return S.str();
> > +  }
> > +
> > +  // If there is no meta data, the function must be a global before the
> value
> > +  // profile annotation pass. Its current linkage may be internal if it
> is
> > +  // internalized in LTO mode.
> > +  return getPGOFuncName (F.getName(), GlobalValue::ExternalLinkage, "");
> >  }
> >
> >  StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef
> FileName) {
> > @@ -149,9 +172,16 @@ GlobalVariable *createPGOFuncNameVar(Fun
> >    return createPGOFuncNameVar(*F.getParent(), F.getLinkage(),
> PGOFuncName);
> >  }
> >
> > -void InstrProfSymtab::create(const Module &M) {
> > -  for (const Function &F : M)
> > -    addFuncName(getPGOFuncName(F));
> > +void InstrProfSymtab::create(Module &M, bool InLTO) {
> > +  for (Function &F : M) {
> > +    // Function may not have a name: like using asm("") to overwrite
> the name.
> > +    // Ignore in this case.
> > +    if (!F.hasName())
> > +      continue;
> > +    const std::string &PGOFuncName = getPGOFuncName(F, InLTO);
> > +    addFuncName(PGOFuncName);
> > +    MD5FuncMap.push_back(std::make_pair(Function::getGUID(PGOFuncName),
> &F));
> > +  }
> >
> >    finalizeSymtab();
> >  }
> >
> > Modified:
> llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp?rev=264902&r1=264901&r2=264902&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
> (original)
> > +++ llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp Wed
> Mar 30 13:37:52 2016
> > @@ -750,6 +750,15 @@ void PGOUseFunc::annotateIndirectCallSit
> >    if (DisableValueProfiling)
> >      return;
> >
> > +  // Write out the PGOFuncName if this is different from it's raw name.
> > +  // This should only apply to internal linkage functions only.
> > +  const std::string &FuncName = getPGOFuncName(F);
> > +  if (FuncName != F.getName()) {
> > +    LLVMContext &C = F.getContext();
> > +    MDNode *N = MDNode::get(C, MDString::get(C, FuncName.c_str()));
> > +    F.setMetadata("PGOFuncName", N);
> > +  }
> > +
> >    unsigned IndirectCallSiteIndex = 0;
> >    PGOIndirectCallSiteVisitor ICV;
> >    ICV.visit(F);
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160330/9408eca1/attachment.html>


More information about the llvm-commits mailing list