<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/58215">58215</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [InstrProf] Duplicated function entries in profile with LTO/ThinLTO enabled in instrumentation build
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
            xur-llvm
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          xur-llvm
      </td>
    </tr>
</table>

<pre>
    We found duplicated function entries in the profile when the instrumentation build has LTO/ThinLTO enabled. For example, in clang's ThinLTO-enabled PGO profile, there are 336 copies of _ZN4llvm2cl6OptionC2ENS0_18NumOccurrencesFlagENS0_12OptionHiddenE with the same hash.
This is wrong.

The root cause is that we give the profd variable associated with a COMDAT function private linkage if it is renamed (with a hash suffix). This is fine in a non-LTO build, since the COMDAT group is still in place: the linker chooses the prevailing definition and removes the other copies altogether.

For the LTO build, the COMDAT group (which includes profd, profc, etc.) might be dissolved. The "profc" variable is fine because it still has linkonce_odr linkage, so there will be one prevailing copy. The "profd" variable, with private linkage, has no chance of being de-duplicated.

When dumping the raw profile, we iterate over all the profd variables and therefore emit duplicated profile counts for the same function.
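
The effect can be sketched with a toy model of symbol resolution. This is only an illustration, not LLVM's or the linker's code: the Symbol type, the resolveSymbols and countRecords helpers, and the symbol names used with them are all hypothetical. It assumes that linkonce_odr symbols are merged by name while private symbols are never merged.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical, not LLVM data structures.
enum class Linkage { LinkOnceODR, Private };

struct Symbol {
    std::string name;    // illustrative name, e.g. "__profd__Z3foov"
    Linkage linkage;
    std::string module;  // module that defined this copy
};

// Keep one prevailing copy of each linkonce_odr symbol. Private symbols are
// local to their module, so every copy survives.
std::vector<Symbol> resolveSymbols(const std::vector<Symbol>& inputs) {
    std::vector<Symbol> kept;
    std::set<std::string> seenODR;
    for (const Symbol& s : inputs) {
        if (s.linkage == Linkage::LinkOnceODR && !seenODR.insert(s.name).second)
            continue;  // a prevailing copy was already kept
        kept.push_back(s);
    }
    assert(kept.size() <= inputs.size());
    return kept;
}

// Model of the profile dump: one record per surviving data entry of that name.
std::size_t countRecords(const std::vector<Symbol>& linked,
                         const std::string& dataName) {
    std::size_t n = 0;
    for (const Symbol& s : linked)
        if (s.name == dataName) ++n;
    return n;
}
```

Feeding this model one linkonce_odr counter symbol and one private data symbol from each of two modules leaves one counter symbol but two data symbols, so countRecords reports two records for the same function. Changing the data symbols to Linkage::LinkOnceODR in the model brings the count back to one.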

The following example shows the problem:
>> cat c.h
extern void bar();
inline __attribute__((noinline)) void foo() {
}

>> cat m1.cc
#include "c.h"
int main() {
  bar();
  foo();
  return 0;
}

>> cat m2.cc
#include "c.h"
__attribute__((noinline)) void bar() {
    foo();
}

>> clang -O2 -flto=thin m1.cc m2.cc -o t_gen -fuse-ld=lld -fprofile-generate=./t
>> ./t_gen && llvm-profdata show -function=foo t/default_*.profraw
Counters:
  _Z3foov:
    Hash: 0x0a4d0ad3efffffff
    Counters: 1
  _Z3foov:
    Hash: 0x0a4d0ad3efffffff
    Counters: 1
Instrumentation level: IR  entry_first = 0
Functions shown: 2
Total functions: 4
Maximum function count: 2
Maximum internal block count: 0

>> clang -O2 m1.cc m2.cc -o p_gen -fuse-ld=lld -fprofile-generate=./p
>> ./p_gen && llvm-profdata show -function=foo p/default_*.profraw
Counters:
  _Z3foov:
    Hash: 0x0a4d0ad3efffffff
    Counters: 1
    Block counts: [2]
Instrumentation level: IR  entry_first = 0
Functions shown: 1
Total functions: 3
Maximum function count: 2
Maximum internal block count: 0

The profile from the ThinLTO build is wrong: foo is listed twice.

I think the fix would be to not give the "profd" variable private linkage for COMDAT functions.

</pre>