[PATCH] [compiler-rt] bug 23600 - sanitizer stack trace pc off by 1

Fri Jun 5 17:05:37 PDT 2015

================
Comment at: lib/sanitizer_common/sanitizer_stacktrace.h:84
@@ -78,3 +83,3 @@
   // Cancel Thumb bit.
-  pc = pc & (~1);
+  return pc & (~1);
 #endif
----------------
kcc wrote:
> samsonov wrote:
> > msebor wrote:
> > > kcc wrote:
> > > > I think the old code was intentional. 
> > > > It cancels the thumb bit (whatever that is) and then executes the rest of the function.
> > > > Did you test it on arm? 
> > > This change isn't necessary for the fix, but I included it because it was trivial and obvious (the function otherwise is missing a return statement), and because I didn't manage to create a test case that made the bug manifest in an observable way.  I tested the same change to the GCC sanitizer on arm with no regressions.
> > I also think the original code was written like that intentionally - we're first canceling the thumb bit, and then subtract one. I agree that it's probably should be rewritten as
> >   return (pc & (~1)) - 1;
> >> the function otherwise is missing a return statement
> No it is not. There is an #else clause. 
> 
> It's very sad that none of our tests cover this use case (although you can't be sure w/o running clang tests on arm),
> but it is not a good reason to change the logic of the code. 
You're right. I completely missed there are two sets of #if/#endf blocks. My bad! I'll remove the change.

================
Comment at: lib/sanitizer_common/sanitizer_stacktrace_libcdep.cc:37
@@ -33,1 +36,3 @@
+    const int adjust = (0 == i) && signaled ? 0 : 1;
+    const uptr pc = trace[i] - adjust;
     SymbolizedStack *frames = Symbolizer::GetOrInit()->SymbolizePC(pc);
----------------
samsonov wrote:
> msebor wrote:
> > kcc wrote:
> > > I frankly don't understand this. 
> > > The old code was to call GetPreviousInstructionPc(), which on some archs subtracted 4 or 8.
> > > Now you just subtract 1. Why? 
> > DWARF line number programs are represented as ranges of addresses that correspond to a given source line.  When looking up a line number for a given PC value, the lookup compares the PC value to the beginning and end address of each range and considers it a match when the PC is anywhere in that range.  It doesn't matter if the PC value points to the beginning of an instruction in the range or some byte past it.  So subtracting 1 byte is just as good as subtracting the size of an instruction.
> We've added architecture-specific `GetPreviousInstructionPc` for a reason: we don't care at all about actual instruction sizes and addresses, we just need PC's that would be correctly handled by symbolizer, and subtracting one provided to be insufficient. As, for the line programs, suppose I have two consecutive instructions:
> <instruction 1> at PC [0x100, 0x102], source line 42.
> <instruction 2> at PC [0x104, 0x105], source line 43.
> 
> Note that instructions are 4-byte aligned. I can imagine the line table for this program matching [0x100,0x102] to line 42, and [0x104,0x105] to line 43. The lookup of 0x103 would not find any line table entry, and thus return no line number at all.
> 
> 
I don't think the scenario you described can happen. After a branch/call, the PC points at the first byte of the instruction following the branch instruction (or following the delay slot). An address that's one less than that is that of the last byte of the branch instruction (or the instruction in the delay slot). There can't very well be a gap between the branch instruction and what follows it in the line number program, but if there was, it has to be dealt with by mapping addresses in the gap to the preceding address range. That's just what libbacktrace does: it stores the line number program as an array of triples (PC, file, line) sorted by PC. When looking up a line number for an address, it calls the bsearch function on the array with a callback that compares the address against the range denoted by the PC of current element and the PC of the element just past it (there's a sentinel element at the end of the array).  I.e., no gaps are possible.

(As an aside, subtracting 1 on RISC is no different than doing the same on x86 where the call instruction is at least 2 or more bytes wide.)

All that being said, I don't think this part of the patch is essential (so far none of the tests failed after replacing the subtraction by 1 with GetPreviousInstructionPc). I'll do some more testing and if there are no regressions I'll submit a new patch with GetPreviousInstructionPc instead.

http://reviews.llvm.org/D10065

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/