[PATCH] D18895: [sanitizer] [SystemZ] Fix stack traces.

Evgeniy Stepanov via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 29 11:26:29 PDT 2016


eugenis added a comment.

In http://reviews.llvm.org/D18895#416641, @koriakin wrote:

> In http://reviews.llvm.org/D18895#401766, @koriakin wrote:
>
> > In http://reviews.llvm.org/D18895#401765, @uweigand wrote:
> >
> > > In http://reviews.llvm.org/D18895#401756, @koriakin wrote:
> > >
> > > > @uweigand I have committed this as-is, but what do you think about the backchain issue? Should we prioritize adding -mbackchain support before ASan, or just XFAIL these tests?
> > >
> > >
> > > Just adding -mbackchain isn't going to help much, since on all current SystemZ Linux distros the whole system -including all libraries- is built without backchain, so you'll always run into code without backchain.
> >
> >
> > We can make -fsanitize=whatever force -mbackchain though (it's already done for -fno-omit-frame-pointer), which makes the main app traceable, plus add the flag to sanitizer build, which is all that really matters.  And mixing non-sanitized libraries is quite sketchy for MSan and TSan either way...
> >
> > > Generally, the only safe way to unwind the stack is via DWARF CFI; we always default to -fasynchronous-unwind-tables, so .eh_frame should always have good CFI.
> >
> >
> > That is true, and it works in general - but IIRC one of the failing tests checks unwinding through a library that's already unloaded (it prints the stack trace of whatever allocated the problematic memory area, I forgot how it works exactly...).  I'll take a closer look at why the fast unwinder is used in the other tests (this patch fixed 10 or so of them).
>
>
> @uweigand I have debugged the issue - the failing tests involved
>
> @uweigand The issue here is that DWARF CFI is *slow*.  Since ASan does a stack trace on every malloc and free (to print it if someone oversteps the buffer later), it uses a "fast" unwinder for those by default, which naïvely walks the stack using fixed-location frame pointers.  So we have three options here:
>
> 1. Always use the DWARF unwinder (causes a HUGE perf hit in the ASan testsuite - it hasn't finished yet, but it seems to be about 10×),
> 2. Implement -mbackchain and make -fsanitize=* imply it,
> 3. Make peace with 2-element backtraces and XFAIL the tests.


We have the exact same situation on linux/x86_64: system libraries are built without frame pointers. In practice this is not a problem: very few allocations that matter are made from system libraries. Simple cases like strdup are handled by intercepting the function and unwinding through the implementation in the ASan runtime library instead. In the very rare cases where it matters, we have a runtime flag to turn on the DWARF unwinder.
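For reference, a minimal sketch of how a program could opt into the slow unwinder for allocation traces. It assumes the runtime flag meant above is fast_unwind_on_malloc (the comment does not name it) and uses the __asan_default_options hook that the ASan runtime looks up at startup:

```c
/* Sketch: bake a default ASan option into the binary.
 * Assumption: "fast_unwind_on_malloc" is the runtime flag referred to above.
 * Setting it to 0 asks the runtime to use the slow (DWARF) unwinder for
 * malloc/free stack traces, at a significant performance cost. */
const char *__asan_default_options(void) {
    return "fast_unwind_on_malloc=0";
}
```

The same setting can be supplied per run through the ASAN_OPTIONS environment variable, which avoids rebuilding and keeps the fast unwinder as the default.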

The DWARF unwinder is too slow for malloc; it slows down malloc-intensive programs a lot (10x or more). This is a problem with real-world code too, not just the ASan test suite.
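For contrast, here is a rough sketch of the kind of frame-pointer walk a "fast" unwinder performs. This is not the sanitizer runtime's code: struct frame, fast_unwind and the two-word frame layout are hypothetical, and the real SystemZ backchain layout differs. It only illustrates why the walk is cheap (one load per frame) and why it stops early when a caller was built without a frame pointer or backchain:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed frame layout: a link to the caller's frame
 * followed by the return address. */
struct frame {
    const struct frame *prev; /* caller's frame, or garbage if not maintained */
    uintptr_t ret;            /* return address into the caller */
};

/* Walk the chain starting at fp, storing up to max return addresses in pcs.
 * [lo, hi) are the bounds of the current thread's stack; any link that
 * leaves them, is misaligned, or does not move toward the stack base
 * terminates the walk. Returns the number of frames recorded. */
size_t fast_unwind(const struct frame *fp, uintptr_t lo, uintptr_t hi,
                   uintptr_t *pcs, size_t max) {
    assert(pcs != NULL || max == 0);
    size_t n = 0;
    while (n < max && fp != NULL) {
        uintptr_t addr = (uintptr_t)fp;
        if (addr < lo || addr + sizeof *fp > hi || addr % sizeof(uintptr_t))
            break; /* link points outside the stack or is misaligned */
        pcs[n++] = fp->ret;
        if ((uintptr_t)fp->prev <= addr)
            break; /* stack grows down, so callers must be at higher addresses */
        fp = fp->prev;
    }
    return n;
}
```

If a frame in the chain does not maintain the link, the walk ends at the first invalid link, which is how very short backtraces arise.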

I vote for (2), but the behaviour should match that of -fno-omit-frame-pointer, which is currently not enabled automatically by ASan.


Repository:
  rL LLVM

http://reviews.llvm.org/D18895




