[llvm] r264902 - [PGO] PGOFuncName in LTO optimizations

Vedant Kumar via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 30 13:21:22 PDT 2016


Some inline comments and nits --

> On Mar 30, 2016, at 1:01 PM, Justin Bogner via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> Rong Xu via llvm-commits <llvm-commits at lists.llvm.org> writes:
>> Author: xur
>> Date: Wed Mar 30 13:37:52 2016
>> New Revision: 264902
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=264902&view=rev
>> Log:
>> [PGO] PGOFuncName in LTO optimizations
>> 
>> PGOFuncNames are used as the key to retrieve the Function definition from the
>> MD5 stored in the profile. For internal linkage function, we prefix the source
>> file name to the PGOFuncNames. LTO's internalization privatizes many
>> global linkage
>> symbols. This happens after value profile annotation, but those internal
>> linkage functions should not have a source prefix. To differentiate compiler
>> generated internal symbols from original ones, PGOFuncName meta data are
>> created and attached to the original internal symbols in the value profile
>> annotation step. If a symbol does not have the meta data, its original linkage
>> must be non-internal.
>> 
>> Also add a new map that maps PGOFuncName's MD5 value to the function definition.
>> 
>> Differential Revision: http://reviews.llvm.org/D17895
> 
> This review didn't show up on the LLVM list at all. Please make sure
> that code is reviewed on list in the future.

> This might be okay for now, but the approach seems questionable. It's
> pretty awkward to have to teach profiling about LTO, and I feel like the
> problems with how we munge internal names come up in other contexts
> too.

Right, I'm worried about what changes (if any) are required to frontends
which rely on getPGOFuncName(). @xur, you mentioned some follow-up work you
were planning in clang. Could you elaborate on what's needed there?


> Longer term, I think that the right thing to do in terms of dealing with
> internal functions is to avoid munging their names at all. To do this,
> we'd need to add something to the function hash or as a new field to
> avoid collisions between unrelated local names, but such an approach
> would clean up the whole class of issues with how we rename these
> symbols.
> 
>> Modified:
>>    llvm/trunk/include/llvm/ProfileData/InstrProf.h
>>    llvm/trunk/lib/ProfileData/InstrProf.cpp
>>    llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
>> 
>> Modified: llvm/trunk/include/llvm/ProfileData/InstrProf.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/InstrProf.h?rev=264902&r1=264901&r2=264902&view=diff
>> ==============================================================================
>> --- llvm/trunk/include/llvm/ProfileData/InstrProf.h (original)
>> +++ llvm/trunk/include/llvm/ProfileData/InstrProf.h Wed Mar 30 13:37:52 2016
>> @@ -155,8 +155,9 @@ inline StringRef getInstrProfFileOverrid
>> inline StringRef getInstrProfNameSeparator() { return "\01"; }
>> 
>> /// Return the modified name for function \c F suitable to be
>> -/// used the key for profile lookup.
>> -std::string getPGOFuncName(const Function &F,
>> +/// used the key for profile lookup. Variable \c InLTO indicates if this
>> +/// is called in LTO optimization passes.
>> +std::string getPGOFuncName(const Function &F, bool InLTO = false,
>>                            uint64_t Version = INSTR_PROF_INDEX_VERSION);
>> 
>> /// Return the modified name for a function suitable to be
>> @@ -302,13 +303,17 @@ private:
>>   StringSet<> NameTab;
>>   // A map from MD5 keys to function name strings.
>>   std::vector<std::pair<uint64_t, StringRef>> MD5NameMap;
>> +  // A map from MD5 keys to function define. We only populate this map
>> +  // when build the Symtab from a Module.
>> +  std::vector<std::pair<uint64_t, Function *>> MD5FuncMap;
>>   // A map from function runtime address to function name MD5 hash.
>>   // This map is only populated and used by raw instr profile reader.
>>   AddrHashMap AddrToMD5Map;
>> 
>> public:
>>   InstrProfSymtab()
>> -      : Data(), Address(0), NameTab(), MD5NameMap(), AddrToMD5Map() {}
>> +      : Data(), Address(0), NameTab(), MD5NameMap(), MD5FuncMap(),
>> +      AddrToMD5Map() {}
>> 
>>   /// Create InstrProfSymtab from an object file section which
>>   /// contains function PGO names. When section may contain raw
>> @@ -326,8 +331,9 @@ public:
>>   inline std::error_code create(StringRef NameStrings);
>>   /// A wrapper interface to populate the PGO symtab with functions
>>   /// decls from module \c M. This interface is used by transformation
>> -  /// passes such as indirect function call promotion.
>> -  void create(const Module &M);
>> +  /// passes such as indirect function call promotion. Variable \c InLTO
>> +  /// indicates if this is called from LTO optimization passes.
>> +  void create(Module &M, bool InLTO = false);
>>   /// Create InstrProfSymtab from a set of names iteratable from
>>   /// \p IterRange. This interface is used by IndexedProfReader.
>>   template <typename NameIterRange> void create(const NameIterRange &IterRange);
>> @@ -357,6 +363,8 @@ public:
>>   /// Return function's PGO name from the name's md5 hash value.
>>   /// If not found, return an empty string.
>>   inline StringRef getFuncName(uint64_t FuncMD5Hash);
>> +  /// Return function from the name's md5 hash. Return nullptr if not found.
>> +  inline Function *getFunction(uint64_t FuncMD5Hash);
>>   /// Return the function's original assembly name by stripping off
>>   /// the prefix attached (to symbols with priviate linkage). For
>>   /// global functions, it returns the same string as getFuncName.
>> @@ -387,6 +395,7 @@ void InstrProfSymtab::create(const NameI
>> 
>> void InstrProfSymtab::finalizeSymtab() {
>>   std::sort(MD5NameMap.begin(), MD5NameMap.end(), less_first());
>> +  std::sort(MD5FuncMap.begin(), MD5FuncMap.end(), less_first());
>>   std::sort(AddrToMD5Map.begin(), AddrToMD5Map.end(), less_first());
>>   AddrToMD5Map.erase(std::unique(AddrToMD5Map.begin(), AddrToMD5Map.end()),
>>                      AddrToMD5Map.end());
>> @@ -402,6 +411,16 @@ StringRef InstrProfSymtab::getFuncName(u
>>   return StringRef();
>> }
>> 
>> +Function* InstrProfSymtab::getFunction(uint64_t FuncMD5Hash) {
>> +  auto Result =
>> +      std::lower_bound(MD5FuncMap.begin(), MD5FuncMap.end(), FuncMD5Hash,
>> +                       [](const std::pair<uint64_t, Function*> &LHS,
>> +                          uint64_t RHS) { return LHS.first < RHS; });
>> +  if (Result != MD5FuncMap.end() && Result->first == FuncMD5Hash)
>> +    return Result->second;
>> +  return nullptr;
>> +}
>> +
>> // See also getPGOFuncName implementation. These two need to be
>> // matched.
>> StringRef InstrProfSymtab::getOrigFuncName(uint64_t FuncMD5Hash) {
>> 
>> Modified: llvm/trunk/lib/ProfileData/InstrProf.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/InstrProf.cpp?rev=264902&r1=264901&r2=264902&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/ProfileData/InstrProf.cpp (original)
>> +++ llvm/trunk/lib/ProfileData/InstrProf.cpp Wed Mar 30 13:37:52 2016
>> @@ -83,9 +83,32 @@ std::string getPGOFuncName(StringRef Raw
>>   return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
>> }
>> 
>> -std::string getPGOFuncName(const Function &F, uint64_t Version) {
>> -  return getPGOFuncName(F.getName(), F.getLinkage(), F.getParent()->getName(),
>> -                        Version);
>> +// Return the PGOFuncName. This function has some special handling when called
>> +// in LTO optimization. The following only applies when calling in LTO passes
>> +// (when \c InLTO is true): LTO's internalization privatizes many global linkage
>> +// symbols. This happens after value profile annotation, but those internal
>> +// linkage functions should not have a source prefix.
>> +// To differentiate compiler generated internal symbols from original ones,
>> +// PGOFuncName meta data are created and attached to the original internal
>> +// symbols in the value profile annotation step
>> +// (PGOUseFunc::annotateIndirectCallSites). If a symbol does not have the meta
>> +// data, its original linkage must be non-internal.
>> +std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) {
>> +  if (!InLTO)
>> +    return getPGOFuncName(F.getName(), F.getLinkage(), F.getParent()->getName(),
>> +                          Version);
>> +
>> +  // InLTO mode. First check if these is a meta data.
>> +  MDNode *MD = F.getMetadata("PGOFuncName");
>> +  if (MD != nullptr) {

nit: This would be more compact as `if (MDNode *MD = F.getMetadata(...)) {`.


>> +    StringRef S = cast<MDString>(MD->getOperand(0))->getString();
>> +    return S.str();
>> +  }
>> +
>> +  // If there is no meta data, the function must be a global before the value
>> +  // profile annotation pass. Its current linkage may be internal if it is
>> +  // internalized in LTO mode.
>> +  return getPGOFuncName (F.getName(), GlobalValue::ExternalLinkage, "");

nit: Please clang-format this last line.


>> }
>> 
>> StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef FileName) {
>> @@ -149,9 +172,16 @@ GlobalVariable *createPGOFuncNameVar(Fun
>>   return createPGOFuncNameVar(*F.getParent(), F.getLinkage(), PGOFuncName);
>> }
>> 
>> -void InstrProfSymtab::create(const Module &M) {
>> -  for (const Function &F : M)
>> -    addFuncName(getPGOFuncName(F));
>> +void InstrProfSymtab::create(Module &M, bool InLTO) {
>> +  for (Function &F : M) {
>> +    // Function may not have a name: like using asm("") to overwrite the name.
>> +    // Ignore in this case.
>> +    if (!F.hasName())
>> +      continue;
>> +    const std::string &PGOFuncName = getPGOFuncName(F, InLTO);
>> +    addFuncName(PGOFuncName);
>> +    MD5FuncMap.push_back(std::make_pair(Function::getGUID(PGOFuncName), &F));

nit: How about `MD5FuncMap.emplace_back(Function::getGUID(...), &F)`?


vedant

>> +  }
>> 
>>   finalizeSymtab();
>> }
>> 
>> Modified: llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp?rev=264902&r1=264901&r2=264902&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp Wed Mar 30 13:37:52 2016
>> @@ -750,6 +750,15 @@ void PGOUseFunc::annotateIndirectCallSit
>>   if (DisableValueProfiling)
>>     return;
>> 
>> +  // Write out the PGOFuncName if this is different from it's raw name.
>> +  // This should only apply to internal linkage functions only.
>> +  const std::string &FuncName = getPGOFuncName(F);
>> +  if (FuncName != F.getName()) {
>> +    LLVMContext &C = F.getContext();
>> +    MDNode *N = MDNode::get(C, MDString::get(C, FuncName.c_str()));
>> +    F.setMetadata("PGOFuncName", N);
>> +  }
>> +
>>   unsigned IndirectCallSiteIndex = 0;
>>   PGOIndirectCallSiteVisitor ICV;
>>   ICV.visit(F);
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits



More information about the llvm-commits mailing list