[PATCH] D108342: [CSSPGO] Enable loading MD5 CS profile.

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 30 14:07:10 PDT 2021


hoy added a comment.

In D108342#2973234 <https://reviews.llvm.org/D108342#2973234>, @wenlei wrote:

> In D108342#2973176 <https://reviews.llvm.org/D108342#2973176>, @hoy wrote:
>
>> In D108342#2971668 <https://reviews.llvm.org/D108342#2971668>, @wenlei wrote:
>>
>>>> The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.
>>>
>>> Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?
>>
>> Yes, that's the way we were using previously. There is an optimization we can do before it comes to the writer for pseudo probe. Pseudo probe comes with GUID so we can use that inside in the profile generator and unwinder. I think @wlei had a change for that which we can land separately.
>
> Okay, but is it not working at at all without Lei's optimization, or is it too slow? Asking because I'm wondering why you had to use llvm-profdata to generate md5 profile, instead of just let writer take care of md5 from llvm-profgen directly.

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.



================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:1049
     return EC;
+  FunctionSamples::UseMD5 = true;
   MD5StringBuf = std::make_unique<std::vector<std::string>>();
----------------
wenlei wrote:
> hoy wrote:
> > wenlei wrote:
> > > Can we move `FunctionSamples::UseMD5 = useMD5()` to be before `readImpl()` inside `read()` and then avoid this extra UseMD5 setting?
> > Actually `useMD5` relies on `readImpl() ` to be set. It is until the reader actually reads the SecNameTable we know we are reading a md5 profile or not.  The reader checks the SecFlagMD5Name flag to see if SecNameTable is filled with real names or md5 names.
> > 
> > 
> Ok.. Taking another look, where in reader do we depend on `FunctionSamples::UseMD5` though? With the current definition (pasted below) for useMD5(), we could check useMD5() within reader, then we avoid this special case here?
> 
> Similar to ProfileIsCS, in general it's more consistent to use reader flag while we're in reader. 
> 
> ```
> bool useMD5() override { return MD5StringBuf.get(); }
> ```
Good question. The dependency is from using `FunctionSamples::getGUID` when loading func profiles. That method checks if `FunctionSamples::UseMD5` is set.

  // Assume the input \p Name is a name coming from FunctionSamples itself.
  // If UseMD5 is true, the name is already a GUID and we
  // don't want to return the GUID of GUID.
  static uint64_t getGUID(StringRef Name) {
    return UseMD5 ? std::stoull(Name.data()) : Function::getGUID(Name);
  }

We can just use `std::stoull(Name.data())` instead to avoid this change.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108342/new/

https://reviews.llvm.org/D108342



More information about the llvm-commits mailing list