[compiler-rt] [llvm-profdata] Do not create numerical strings for MD5 function names read from a Sample Profile. (PR #66164)

William Junda Huang via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 4 18:30:17 PDT 2023


huangjd wrote:

To clarify the implementation of ProfileFuncRef and how it guarantees correctness. 

There are two ways ProfileFuncRef can originate (ignoring copy constructions):
1 - From reading the profile, in which all ProfileFuncRef should have the same representation (either StringRef or MD5) from the source. A profile mixing both is currently considered malformed so we don't care what happens next. I am going to address this issue in a next patch to make the reader more robust.
2 - From the module's function (`Func`) names. Function names are always StringRef.

To ensure correctness, the following conventions need to be enforced:
A - Whenever ProfileFuncRef is used as key in a map, all keys should have the same representation. This excludes the scenario that both representations of the same function are inserted as different entries.
B - Whenever we lookup a ProflieFuncRef-keyed map, we need to make sure the key we query has the same representation as the keys in the map.

The way this work is to use the function `getRepInFormat()`, which checks `bool FunctionSamples::UseMD5`. This flag is set immediately after reading the profile, and before any ProflieFuncRef-keyed map (outside of the sample profile reader) is created. Whenever we perform the action in A or B, if the key originates from 2 then it is converted to the representation matching those originates from 1 using `getRepInFormat()`. 

Exception - If we are using HashKeyMap, then we don't need these requirements because both StringRef and MD5 representation of the same function have the same hash value, so a lookup using either form will get the match. (Assume we ignore real hash collision) 





As for the concern of map look ups in general, where the map is populated with one representation, 

https://github.com/llvm/llvm-project/pull/66164


More information about the llvm-commits mailing list