[PATCH] D92621: [SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used.

Mon Dec 7 11:12:01 PST 2020

wmi added a comment.

In D92621#2435924 <https://reviews.llvm.org/D92621#2435924>, @wenlei wrote:

>> We find that after this change, the elapsed time to build a large application distributively is reduced by 5% and the accumulative cpu time used for building is also reduced by 5%.
>
> This is great, thanks for the patch! Just to clarify, did you mean 5% of end to end ThinLTO+AutoFDO build? I'm curious what is the total profile reading time your saw in terms of %? and how different that % is between md5 profile vs non-md5 profile? Currently we mostly use non-md5 profile, assuming md5 profile is more about saving profile size.

Yes, it is 5% saving of end to end ThinLTO + AutoFDO build time.

Each module compiling has different % of profile reading time. The percentage is negligible for large source module but can be significant for small source module. I tried it on a very small file.

  The build time using md5 profile was 0.55 second. 
  The build time using non-md5 profile was 0.75 second. 
  The build time using fixlenmd5 profile was 0.15 second. 
  The build time not using any profile was 0.1 second. 

Because for a large project, many source files are small or medium sizes so the building time for small files matters especially for build cpu resource. It also matters for end-to-end build time but usually not as significant. For the experiment I did (a 20M binary profile), the end-to-end build time saving was about the same as the aggregate build cpu resource saving (both were 5%), but from our experience using larger profile (like 300M profile), it will affect aggregate build cpu resource more than end-to-end build time.

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:380
+    SR = MD5StringBuf->back();
+    Data = SavedData;
+  }
----------------
wenlei wrote:
> If we return error code before reaching this point, `Data` pointer will not be restored, is that ok? What about using RAII to make sure restoration always happens? 
> I also noticed this is not the only place where we do save/restore for `Data` pointer, but other instances are not part of this change.
When error happens, the reader/writer won't proceed and will return to caller calling read/write API interface, so the inconsistent Data pointer doesn't matter. The reason we don't want to issue a fatal error is the caller may want to skip the processing of the current problematic profile and proceed to the next one.    

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:729
     return EC;
-  NameTable.reserve(*Size);
   MD5StringBuf = std::make_unique<std::vector<std::string>>();
----------------
hoy wrote:
> Nit: this might still be helpful to the non-fixed MD5 path.
Good catch. 

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:769
+    // check using the size of NameTable.
+    NameTable.resize(*Size + NameTable.size());
+
----------------
hoy wrote:
> Nit: should `NameTable` be always empty right before here? Could an assert be useful?
Currently NameTable is always empty before here but it is possible we have multiple NameTable sections. I have a followup NFC to support that.  

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:372
+  // read NameTable index.
+  auto Idx = readMD5StringIndex();
+  if (std::error_code EC = Idx.getError())
----------------
wenlei wrote:
> wmi wrote:
> > hoy wrote:
> > > Can this be just `readStringIndex(NameTable)`  since `NameTable` is preallocated?
> > Good point. That makes the code simpler. 
> Actually it looks like both fixed path and ULEB128 path need to access `NameTable[*Idx]` in addition to `readStringIndex(NameTable)`? What about just call `SampleProfileReaderBinary::readStringFromTable` for both, and diverge only after that - return for ULEB128 path, check error or empty StringRef for fixed path?
For fixed length MD5 path, we want to get a reference of NameTable[*Idx] and change it after we read the real name from memory. This is different from the ULEB128 path, where NameTable has been populated up front, and readStringIndex will only return a copy of StringRef. 

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92621/new/

https://reviews.llvm.org/D92621