[PATCH] D90125: [CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 11 12:58:21 PST 2020


wenlei added inline comments.


================
Comment at: llvm/include/llvm/ProfileData/SampleProf.h:361
+// Example of full context string (note the wrapping `[]`):
+//    `[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]`
+// Example of context-less function name (same as AutoFDO):
----------------
davidxl wrote:
> wenlei wrote:
> > davidxl wrote:
> > > wenlei wrote:
> > > > wenlei wrote:
> > > > > davidxl wrote:
> > > > > > The syntax of the context string can probably be made more consistent like:
> > > > > > 
> > > > > > SampleContext : LeafContext
> > > > > >                            : ParentContext   LeafContext
> > > > > > 
> > > > > > LeafContext: function_name
> > > > > > ParentContext: [ParentFrames]
> > > > > > ParentFrames:   OneParentFrame   
> > > > > >                         : ParentFrames   OneParentFrame
> > > > > > OneParentFrame:  function_name:line @
> > > > > > 
> > > > > > So in your example, the full context string should look like:
> > > > > > 
> > > > > > [main:3 @_Z5funcAi:1 @] _Z8funcLeafi
> > > > > > 
> > > > > Agreed, that's indeed more consistent and I like that better. Thanks for the suggestion. Will make the change (llvm-profgen change will follow too).
> > > > Actually, there're other implication if we were to change context string to be `[context] leaf`. With the currently syntax, when we promote context, new context string is a substring of the original one. So we just create StringRef wrapper for context promotion without creating new strings. 
> > > > 
> > > > E.g. when `main:3 @ _Z5funcAi:1 @ _Z8funcLeafi` is promoted, it becomes `_Z5funcAi:1 @ _Z8funcLeafi` which is sub string of the original one (StringRef that reuses the underlying string).
> > > > 
> > > > If we use `[main:3 @_Z5funcAi:1 @] _Z8funcLeafi`, promotion will lead to something like `[_Z5funcAi:1 @] _Z8funcLeafi`, which is no longer a substring and a new string need to be created. We use two StringRef to represent context part and leaf part today, but we also need a consistent string representation for the full context `getNameWithContext`.
> > > > 
> > > > So practically, current syntax is more efficient for context promotion.
> > > > 
> > > > Additionally, with the proposed syntax, top level context would look like this `[] main` to differentiate from context-less header. In this case, `[main]` is probably better.
> > > > 
> > > > What do you think?
> > > > 
> > > > Actually, there're other implication if we were to change context string to be `[context] leaf`. With the currently syntax, when we promote context, new context string is a substring of the original one. So we just create StringRef wrapper for context promotion without creating new strings. 
> > > > 
> > > > E.g. when `main:3 @ _Z5funcAi:1 @ _Z8funcLeafi` is promoted, it becomes `_Z5funcAi:1 @ _Z8funcLeafi` which is sub string of the original one (StringRef that reuses the underlying string).
> > > 
> > > In this case, where does the bracket go?
> > > 
> > > > 
> > > > If we use `[main:3 @_Z5funcAi:1 @] _Z8funcLeafi`, promotion will lead to something like `[_Z5funcAi:1 @] _Z8funcLeafi`, which is no longer a substring and a new string need to be created. We use two StringRef to represent context part and leaf part today, but we also need a consistent string representation for the full context `getNameWithContext`.
> > > > 
> > > > So practically, current syntax is more efficient for context promotion.
> > > > 
> > > > Additionally, with the proposed syntax, top level context would look like this `[] main` to differentiate from context-less header. In this case, `[main]` is probably better.
> > > 
> > > what is the difference between top level context vs context less header?
> > > 
> > > 
> > > > 
> > > > What do you think?
> > > > 
> > > 
> > > 
> > > In this case, where does the bracket go?
> > 
> > Internally, we don't use the bracket, and since they're the first and last character, we just create a StringRef with bracket removed for `SampleContext`, without creating new string.
> > 
> > So in profile file, we have `[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]`, while in `SampleContext` used by LLVM and tools, we have `main:3 @ _Z5funcAi:1 @ _Z8funcLeafi`. The form used by `SampleContext` is easier for context promotion, and the conversion from `[]` form to `SampleContext` is just a StringRef wrapper.
> > 
> > > what is the difference between top level context vs context less header? 
> > 
> > They have different meanings, consider a dso, context-less profile `foo` means we just don't know the calling context, while context profile `[foo]` means this is called directly from external function. 
> > 
> > In practice, we don't have context-less profile and context profile in a single profile file now, so it's also about consistency in context profile and enough differentiation between context profile and context-less profile (i.e reserve the form `main` *only* for context-less profile for today's AFDO).
> > > In this case, where does the bracket go?
> > 
> > Internally, we don't use the bracket, and since they're the first and last character, we just create a StringRef with bracket removed for `SampleContext`, without creating new string.
> > 
> > So in profile file, we have `[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]`, while in `SampleContext` used by LLVM and tools, we have `main:3 @ _Z5funcAi:1 @ _Z8funcLeafi`. The form used by `SampleContext` is easier for context promotion, and the conversion from `[]` form to `SampleContext` is just a StringRef wrapper.
> 
> That is what I was thinking -- for external format, we need to make it well defined and as consistent as possible, while for internal representation, any format is fine. Here is the question:
> 1) is there a need to share external and internal rep? Once the external strings are parsed they can be discarded
> 2) for internal format, is there more compact form ? Is there a need to use string ?
> > 
> > > what is the difference between top level context vs context less header? 
> > 
> > They have different meanings, consider a dso, context-less profile `foo` means we just don't know the calling context, while context profile `[foo]` means this is called directly from external function. 
> > 
> > In practice, we don't have context-less profile and context profile in a single profile file now, so it's also about consistency in context profile and enough differentiation between context profile and context-less profile (i.e reserve the form `main` *only* for context-less profile for today's AFDO).
> 
> 
> for internal format, is there more compact form ? Is there a need to use string ? 

This is something we thought about too. We were thinking about something along the lines of a rolling hash (integer encoding that is friendly to context promotion operation) to eliminate StringRef. But actually with current implementation, it's quite compact already because it's always StringRef and we never need to create any new string (the order of context syntax, root to leaf, was also intentional to make promoted context substring of original context). So we're happy with it.

> is there a need to share external and internal rep? Once the external strings are parsed they can be discarded 

Sharing between internal and external format isn't must have, but I think it's nice to have if cost is minimal. 

External strings can be discarded but that would require non-trivial framework changes that affects AFDO too. Currently, for AFDO (and CSSPGO) we keep profile file in a memory buffer, and all external strings are StringRef, which is used throughout SampleProfileReader and hence SampleProfileLoader, assuming the underlying strings are always available. E.g. FunctionSamples::Name is StringRef wrapped around external string from the memory buffer of input profile. CSSPGO uses the same framework, and all of that would need to be changed if we want to discard external strings (essentially freeing the memory buffer after loading profile). 

This (freeing the memory buffer after loading profile for both AFDO and CSSPGO) feels like more of an optimization rather than part of the CSSPGO framework, and if we do that optimization later, we could change internal representation accordingly. What do you think?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90125/new/

https://reviews.llvm.org/D90125



More information about the llvm-commits mailing list