[PATCH] D145171: [FSAFDO] Improve FS discriminator encoding

Rong Xu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 8 13:44:25 PST 2023


xur added a comment.

In D145171#4174356 <https://reviews.llvm.org/D145171#4174356>, @wenlei wrote:

>> This reduces the hash conflicts.
>
> Curious how did you check/detect conflicts/collisions?

Most from eyeballing the change -- I manually look at some functions.
The other indicator was the number of discriminators created in some lines in templates header.  For some line, we overflow the bits.

>> one should expect a performance gain over the binary without this change.
>
> How big is the gain you saw?

We measured this for a variety of programs -- we are seeing improvement range from 0.7% to 2.0% on top of current FSAFDO.
The gain usually is higher for iterative profiles. (This is with the BBSize in the hash).



================
Comment at: llvm/include/llvm/IR/DebugInfoMetadata.h:1924
+  /// return scope name (the demangled name).
+  const StringRef getName() const {
+    DISubprogram *SP = getScope()->getSubprogram();
----------------
hoy wrote:
> nit: name it `getSubprogramLinkageName`? 
This is a better name. I will change to this.


================
Comment at: llvm/lib/CodeGen/MIRFSDiscriminator.cpp:67
   for (DIL = DIL->getInlinedAt(); DIL; DIL = DIL->getInlinedAt()) {
     Ret ^= updateHash(std::to_string(DIL->getLine()));
     Ret ^= updateHash(DIL->getScope()->getSubprogram()->getLinkageName());
----------------
wenlei wrote:
> same here, converting to string first seem unnecessary. 
Will switch to xxHash64.


================
Comment at: llvm/lib/CodeGen/MIRFSDiscriminator.cpp:77
+      return 0;
+    return MD5Hash(Str);
+  };
----------------
wenlei wrote:
> MD5 as cryptographic hash is expensive, `xxHash64` should be good enough in terms of distribution and collision avoidance. Now that we're changing discriminator algorithm, wondering if we should take the opportunity to move to xxhash for the new version. (For compilation with compact-binary profile where MD5 is used a lot, MD5 shows up quite hot in perf profile)
This is a good idea. I think this is a good opportunity to switch to less expensive hash algorithm. 


================
Comment at: llvm/lib/CodeGen/MIRFSDiscriminator.cpp:139
+    if (ImprovedFSDiscriminator)
+      BBSizeHash = MD5Hash(std::to_string(BBSize(BB)));
+
----------------
wenlei wrote:
> converting int to string just to get an hash seems like an overkill. `xxHash64(ArrayRef<uint8_t> Data)` should be fast and good enough.  
Got it. I will probably remove BBSize from the hash.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145171/new/

https://reviews.llvm.org/D145171



More information about the llvm-commits mailing list