[PATCH] D96932: [SampleFDO] Support enabling -funique-internal-linkage-name

Wei Mi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 22 17:52:06 PST 2021


wmi added a comment.

>>> For context tracker, we have canonicalization done in llvm-profgen as part of profile generation which is why we didn't canonicalize on the compiler side.. Now with the new model that let compiler decide whether to strip suffix based on section flag, we should skip canonicalization llvm-profgen, does that sound right? I guess your profile generation tool doesn't do name canonicalization?
>>
>> Another usage of canonicalization is to merge profile. For example, suppose we have a function clone from const propagation, and we want to merge the profile from the clone with the profile from its original body. It is easier to handle it through llvm-profgen and make the profile size smaller before feed it into compiler. Our profile generation tool does name canonicalization.
>
> Good point about the cloned functions. I'm wondering if the canonicalization should be uniformly done in the compiler in every place does name mapping, i.e, from Function to Profile or backwards. The profile generator may not have a full view of the IR and it may not strip away the suffixes accurately without seeing the `sample-profile-suffix-elision-policy` attribute.  cc @wlei

Right, that is valid concern. I remember the function attribute was added because Go lang was trying to add some SampleFDO support and they wanted to use a different elision policy than C++, but they havn't really used it, so we currently get away with it by using a same elision strategy for all symbols when we process a profile. I agree it is better to do all the canonicalization work in compiler. Profile merging is less of a concern.



================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:774
   static StringRef getCanonicalFnName(StringRef FnName, StringRef Attr = "") {
-    static const char *knownSuffixes[] = { ".llvm.", ".part." };
+    static const char *knownSuffixes[] = {".llvm.", ".part.", ".__uniq."};
     if (Attr == "" || Attr == "all") {
----------------
hoy wrote:
> tmsriram wrote:
> > As an aside, where does ".part." come from?
> Can you please add a comment about the order of these suffixes? Looks like it's in reversed appending order, i.e., "._uniq." is always the first suffix to the original name. 
It is there for a long time. I guess it is copied from gcc because gcc has such suffix like '.isra.' and '.part.' . I cannot find the suffix in llvm code base. I just notice that bb section add '.__part.' suffix and I havn't understand in which case we add the '.__part.' suffix. If it needs some change as well, we can address it in a separate patch. 


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:784
+        if (Suffix == ".__uniq." && FunctionSamples::HasUniqSuffix)
+          continue;
         auto It = Cand.rfind(Suffix);
----------------
tmsriram wrote:
> ```
> Do we need a test where "llvm", "part" and "__uniq" appear in the symbol name?  Also, isn't it "__part"?
> ```
I updated the test to include the case with name like "symbol.__uniq.123.llvm.456"


================
Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:458
 
+  /// Return whether any name in the profile contains ".uniq." suffix.
+  virtual bool hasUniqSuffix() { return false; }
----------------
tmsriram wrote:
> ```
> typo: "__uniq"
> ```
Thanks for the catch. Fixed.


================
Comment at: llvm/lib/ProfileData/SampleProf.cpp:38
+bool FunctionSamples::UseMD5 = false;
+bool FunctionSamples::HasUniqSuffix = true;
 } // namespace sampleprof
----------------
tmsriram wrote:
> Why is this true by default?
Because for extbinary profile format, we can have the flag in the binary showing the profile has name with ".__uniq." suffix, so nothing to be worry about.

However, for other types of format, we don't have the flag to showing that. We can examine each name when we read the profile to see whether there is the suffix, but that will increase compile time. The solution is, for other format, we make the default behavior to assume there is ".__uniq." suffix (i.e. make FunctionSamples::HasUniqSuffix default true). It will have profile mismatch when the profile doesn't contain ".__uniq." suffix while the IR in the optimized build contains ".__uniq." suffix. That will be a transient regression for the format other than extbinary. I think that is the tradeoff to minimize impact for profile other than extbinary format when we enable -funique-internal-linkage-name by default.   


================
Comment at: llvm/lib/ProfileData/SampleProf.cpp:251
+    } else {
+      CalleeName = getCanonicalFnName(CalleeName, "selected");
+    }
----------------
tmsriram wrote:
> Could you also add a code comment on how this default could be wrong if at all and its implications?
I added a comment for it.


================
Comment at: llvm/lib/ProfileData/SampleProfWriter.cpp:211
+  for (const auto &I : NameTable) {
+    if (I.first.find(".__uniq.") != StringRef::npos) {
+      addSectionFlag(SecNameTable, SecNameTableFlags::SecFlagUniqSuffix);
----------------
tmsriram wrote:
> ```
> Another suggestion to store __uniq as a constant somewhere and reuse that .
> ```
Good point. Done. 


================
Comment at: llvm/lib/Transforms/IPO/SampleContextTracker.cpp:212
+  }
+
   ContextTrieNode *CalleeContext = getCalleeContextFor(DIL, CalleeName);
----------------
tmsriram wrote:
> Saw this code snippet for the second time, also used in SampleProf.cpp. Could we refactor to one function?
Good suggestion. Done.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96932/new/

https://reviews.llvm.org/D96932



More information about the llvm-commits mailing list