[PATCH] D66979: [InstrProf] Tighten a check for malformed data records in raw profiles

Pete Cooper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 3 09:20:54 PDT 2019


pete added inline comments.


================
Comment at: llvm/lib/ProfileData/InstrProfReader.cpp:419
 
-  // Check bounds.
-  if (RawCounts.data() < CountersStart ||
-      RawCounts.data() + RawCounts.size() > NamesStartAsCounter)
+  // Check bounds. Note that the counter pointer embedded in the data record
+  // may itself be corrupt.
----------------
vsk wrote:
> vsk wrote:
> > davidxl wrote:
> > > how does the corruption happen?
> > Not yet sure, this needs more investigation.
> > 
> > The corruption happens on the WIP lldb code coverage bot [1]. The binary is debugserver. The failure reason is that a CounterPtr field within a InstrProfData record is incorrect. I am not sure how a merge failure could cause this, because the on-line profile merging does not mutate the CounterPtr field. It's possible that debugserver was miscompiled.
> > 
> > My debugging notes:
> > 
> > - Symtab contains _ZN17DNBBreakpointList31FindBreakpointsThatOverlapRange, must be debugserver
> > - std::__1::allocator<char>::allocator() has the bad counter pointer
> > - (lldb) p RawCounts.data()
> > - (const unsigned long long *) $1 = 0x00000001022b14e8
> > - (lldb) p NumCounters
> > - (uint32_t) $2 = 1
> > - (lldb) p NamesStartAsCounter
> > - (const unsigned long long *) $3 = 0x0000000100e693d0
> > - TestLldbGdbServer.py times out, then the test harness kills debugserver. Possible cause?
> > 
> > [1] http://lab.llvm.org:8080/green/view/Experimental/job/coverage/20/consoleText
> + @pete for dyld expertise -- for context, this is about malformed profile data seen when generating code coverage, and I suspect dyld might play a role.
> 
> Some more debugging notes:
> 
> I modified the raw profile reader to create an InstrProfRecord with raw-counts = {999999, 999999} when it finds a malformed CounterPtr. Feeding in the malformed profile, I see that the first 162 records are well-formed, but that the last 6662 records have an incorrect CounterPtr field. Here is the last valid record and the first invalid record:
> 
> ```
>   std::__1::__compressed_pair_elem<std::__1::allocator<char>, 1, true>::__compressed_pair_elem():
>     Hash: 0x0000000000000000
>     Counters: 1
>     Function count: 95500
>     Block counts: []
>   std::__1::allocator<char>::allocator():
>     Hash: 0x0000000000000000
>     Counters: 2
>     Function count: 999999
>     Block counts: [999999]
> ```
> 
> Substituting fake '999999' counts for the raw counts in malformed records is enough to allow the profile to be parsed successfully. Note that all of the names are available, which means that the names section (which follows the data section) was written correctly. Further, printing out the computed CounterOffset field for malformed records, I see that they are not random, but instead look monotonic:
> 
> ```
> CounterOffset: 28c286  NumCounters: 1
> CounterOffset: 28c287  NumCounters: 3
> CounterOffset: 28c28a  NumCounters: 1
> CounterOffset: 28c28b  NumCounters: 1
> CounterOffset: 28c28c  NumCounters: 4
> ```
> 
> I cannot reproduce this issue in local testing with debugserver. Based on all of this, a miscompile seems unlikely, as does a write failure.
> 
> It looks as if the CounterPtr fields within the __DATA segment were all rebased by the dynamic loader (dyld). On Darwin, it appears that the data in __llvm_prf_cnts is emitted linkonce_odr. Perhaps the debugserver process dlopen()'d an instrumented library which contained conflicting definitions of some functions: dyld would then have to pick one out of multiple candidate definitions, and then rewrite pointers to the old definitions. @pete does that sound plausible?
It's possible yeah.  I don't remember a relationship between link once ODR and weak definition coalescing, but the two could be used together if you wanted.

You can see whether these symbols would be coalesced by dyld if they are "weak binds" with (for example on libc++.1.dylib)

llvm-objdump -macho -weak-bind /usr/lib/libc++.1.dylib


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66979/new/

https://reviews.llvm.org/D66979





More information about the llvm-commits mailing list