[PATCH] D108435: [CSSPGO] split context string II - reader/writer changes

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 26 17:34:13 PDT 2021


wenlei added inline comments.


================
Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:135
 //
 // Binary format
 // -------------
----------------
In this section, we need to add documentation/spec for CSNameTable how context is stored, and how it interacts with NameTable.


================
Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:507
   /// to their corresponding profiles.
-  StringMap<FunctionSamples> Profiles;
+
+  SampleProfileMap Profiles;
----------------
nit: remove the blank line


================
Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:579
+
+  /// Create a context vector from a give context string and save it in
+  /// CSNameTable.
----------------
typo: given


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:326-327
+        // otherwise it's treated as context-less function name only.
+        auto Context = createContext(FName);
+        FContext.setContext(Context);
         ++CSProfileCount;
----------------
Even though we only need to deal with string context in profile reader/writer for text profile, it's probably still cleaner to keep all string context related parsing into SampleContext. 

`createContext` is more like a ctor. I'd prefer keep string decoding, createContext in its original place in SampleContext. That way, we can construct a context from string, SampleContext::setContext remain a private helper too, and the logic here can be simpler, just like before.

SampleContext does still have getContextString and toString, so it's not really isolated from string representation, might as well keep all string stuff together there for consistency. 


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:496
 ErrorOr<StringRef> SampleProfileReaderBinary::readStringFromTable() {
-  auto Idx = readStringIndex(NameTable);
+  auto NTable = getNameTable();
+  assert(NTable && "NTable should not be nullptr");
----------------
I think `getNameTable` exist as a virtual function only because of the use in sample loader for `profile-sample-accuarate` and `profile-accurate-for-symsinlist`. Here can we still access the NameTable directly without going through virtual call? Same for other places where `NameTable` is directly accessible. For writer, before this change, we indeed only have `NameTable` for `SampleProfileWriterBinary` without virtual getter, and we access it without going through virtual function call there. 


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:632
+std::error_code SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start,
+                                                           bool IsContextName) {
   Data = Start;
----------------
Same here, avoid `IsContextName` as param, but dispatch based on `SampleProfileReader::ProfileIsCS`


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:638
 
-  auto FName(readStringFromTable());
-  if (std::error_code EC = FName.getError())
+  auto FContext(readNameFromTable(IsContextName));
+  if (std::error_code EC = FContext.getError())
----------------
nit: this can be confusing, readNameFromTable can mislead people to think FContext is string (or vector of strings) type. the auto also isn't helping. Either spell out the type name, or rename `readNameFromTable` to something like `readSampleContextFromTable`




================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:800
+      if (std::error_code EC =
+              readFuncProfile(Data, FunctionSamples::ProfileIsCS))
         return EC;
----------------
Since we have `SampleProfileReader::ProfileIsCS`, within reader it probable makes more sense to use the reader instance flag as opposed to the global FunctionSamples::ProfileIsCS flag. 

We were not very consistent in past, might be good to clean up as you're expanding the use of ProfileIsCS. 


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:828
       // iterating through the ordered names.
-      struct Comparer {
-        // Ignore the closing ']' when ordering context
-        bool operator()(const StringRef &L, const StringRef &R) const {
-          return L.substr(0, L.size() - 1) < R.substr(0, R.size() - 1);
-        }
-      };
-      std::set<StringRef, Comparer> OrderedNames;
+      std::set<SampleContext> OrderedNames;
       for (auto Name : FuncOffsetTable) {
----------------
OrderedNames -> OrderedContexts


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:1098
+
+      Context.emplace_back(SampleContextFrame(
+          FName.get(), LineLocation(LineOffset.get(), Discriminator.get())));
----------------
Let emplace_back take variadic arguments and forward to the constructor directly instead of doing a move copy?


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:1102
+
+    PNameVec->emplace_back(Context);
+  }
----------------
How about we emplace an empty slot before going into the inner loop, then we can operate on the vector directly (`PNameVec.back().emplace_back(...)`) and avoid copying temporary `Context` onto `PNameVec`?


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:1117
 
-    SampleContext FContext(*FName);
-    bool ProfileInMap = Profiles.count(FContext);
+    bool ProfileInMap = Profiles.count(FContext.get());
 
----------------
nit: peal/hoist the `get()` so we don't have to call it for every use.


================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:506
+ErrorOr<SampleContext>
+SampleProfileReaderBinary::readNameFromTable(bool IsContextName) {
+  auto FName(readStringFromTable());
----------------
hoy wrote:
> hoy wrote:
> > wmi wrote:
> > > This param is unused. Write it as readNameFromTable(bool /* IsContextName */) to make it more explicit. 
> > Sounds good.
> Actually I can only do that in the header file for the function declaration. The function definition needs a real name for param.
Can we avoid this parameter altogether? For one profile, it's either going to be CS or non-CS, so the dispatch can be based on state/member of the reader instead of relying on a param for each invocation. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108435/new/

https://reviews.llvm.org/D108435



More information about the llvm-commits mailing list