[cfe-dev] Making MSAN Easier to Use: Providing a Sanitized Libc++

Eric Fiselier via cfe-dev cfe-dev at lists.llvm.org
Sun Aug 14 19:14:53 PDT 2016


>  As a practical matter, I can't set $PLATFORM and/or $LIB in my rpath and
have ld.so do the right thing in this context.

Can't Clang compile the sanitized executable with a special RPATH pointing
to the correct libc++ folder?
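
To illustrate the RPATH idea, here is a minimal sketch of how a driver could pick the link flags. The directory layout (`<resource_dir>/lib/<name>`), the subdirectory names, and the helper `sanitized_libcxx_link_flags` are all hypothetical; nothing like this exists in Clang. Only `-L` and `-Wl,-rpath,` are real linker-facing flags.

```python
import posixpath

# Hypothetical mapping from -fsanitize= value to an install subdirectory.
# These names are assumptions for illustration only.
SANITIZER_SUBDIRS = {
    "address": "asan",
    "memory": "msan",
    "thread": "tsan",
    "undefined": "ubsan",
}


def sanitized_libcxx_link_flags(sanitizer, resource_dir):
    """Return flags that locate the sanitized libc++ at link time (-L)
    and record the same directory in the executable's RPATH."""
    libdir = posixpath.join(resource_dir, "lib", SANITIZER_SUBDIRS[sanitizer])
    return ["-L" + libdir, "-Wl,-rpath," + libdir]


flags = sanitized_libcxx_link_flags("memory", "/opt/clang")
print(" ".join(flags))
```

Because the path is chosen at compile time from the sanitizer in use, ld.so would not need to expand $PLATFORM or $LIB to find the right library.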

> Moreover, it is really a property of how you compiled, so I think using
an alternate library name is natural.

Using an alternate library name will likely cause problems if a
non-sanitized libc++ is also present. Since both libraries provide
exactly the same symbols, it's possible that symbols in the
non-sanitized libc++ will replace the sanitized versions.
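
The concern can be shown with a toy model of dynamic symbol lookup, where the first loaded library that defines a symbol wins. This is a deliberately simplified sketch, not how any real loader is implemented; the library names `libc++-msan.so.1` and the symbol `some_libcxx_symbol` are hypothetical.

```python
def resolve(symbol, load_order, exports):
    """Return the first library in load order that defines `symbol`,
    mimicking first-match-wins global symbol lookup."""
    for lib in load_order:
        if symbol in exports[lib]:
            return lib
    return None


# Both libraries export the same symbol names (hypothetical placeholder).
exports = {
    "libc++.so.1": {"some_libcxx_symbol"},
    "libc++-msan.so.1": {"some_libcxx_symbol"},
}

# If a dependency drags in the unsanitized library first, it wins.
unsanitized_first = ["libc++.so.1", "libc++-msan.so.1"]
sanitized_first = ["libc++-msan.so.1", "libc++.so.1"]

print(resolve("some_libcxx_symbol", unsanitized_first, exports))
print(resolve("some_libcxx_symbol", sanitized_first, exports))
```

With distinct sonames both libraries can end up loaded, and which definition is used depends on load order rather than on how the program was compiled.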




On Sun, Aug 14, 2016 at 7:31 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> > From: "Jonathan Roelofs via cfe-dev" <cfe-dev at lists.llvm.org>
> > To: "Eric Fiselier" <eric at efcs.ca>, "clang developer list" <cfe-dev at lists.llvm.org>, "Chandler Carruth" <chandlerc at gmail.com>, "Kostya Serebryany" <kcc at google.com>, "Evgenii Stepanov" <eugenis at google.com>
> > Sent: Sunday, August 14, 2016 7:07:00 PM
> > Subject: Re: [cfe-dev] Making MSAN Easier to Use: Providing a Sanitized Libc++
> >
> >
> >
> > On 8/14/16 4:05 PM, Eric Fiselier via cfe-dev wrote:
> > > Sanitizers such as MSAN require the entire program to be
> > > instrumented; anything less leads to plenty of false positives.
> > > Unfortunately this can be difficult to achieve, especially for the
> > > C and C++ standard libraries. To work around this the sanitizers
> > > provide interceptors for common C functions, but the same solution
> > > doesn't work as well for the C++ STL. Instead users are forced to
> > > manually build and link a custom sanitized libc++. This is a huge
> > > PITA and I would like to improve the situation, not just for MSAN
> > > but for all sanitizers. I'm working on a proposal to change this.
> > > The basis of my proposal is:
> > >
> > > Clang should install/provide multiple sanitized versions of Libc++
> > > and a
> > > mechanism to easily link them, as if they were a Compiler-RT
> > > runtime.
> > >
> > > The goal of this proposal is:
> > >
> > > (1) Greatly reduce the number of false positives caused by using an
> > > un-sanitized STL.
> > > (2) Allow sanitizers to catch user bugs that occur within the STL
> > > library, not just its headers.
> > >
> > > The basic steps I would like to take to achieve this are:
> > >
> > > (1) Teach the compiler-rt CMake how to build and install each
> > > sanitized libc++ version alongside its other runtimes.
> > > (2) Add options to the Clang driver to support linking/using these
> > > libraries.
> > >
> > > I think this proposal is likely to be contentious, so I would like
> > > to focus on its details. Once I have some feedback on these details
> > > I'll put together a formal proposal, including a plan for
> > > implementing it.
> > > The details I would like input on are:
> > >
> > > (A) What kind and how many sanitized versions of libc++ should we
> > > provide?
> > > -----------------------------------------------------------------
> > >
> > > I think the minimum set would be Address (which includes Leak),
> > > Memory
> > > (With origin tracking?), Thread, and Undefined.
> > > Once we get into combinations of sanitizers things get more
> > > complicated.
> > > What other sanitizer combinations should we provide?
> > >
> > > (B) How should we handle UBSAN?
> > > ---------------------------------------------------
> > >
> > > UBSAN is really just a collection of sanitizers and providing
> > > sanitized
> > > versions of libc++ for every possible configuration is out of the
> > > question.
> > > Instead we should figure out what subset of UBSAN checks we want to
> > > enable in sanitized libc++ versions. I suspect we want to disable
> > > the
> > > following checks.
> > >
> > > * -fsanitize=vptr
> > > * -fsanitize=function
> > > * -fsanitize=float-divide-by-zero
> > >
> > > Additionally, UBSAN can be combined with every other sanitizer
> > > group (i.e. Address, Memory, Thread).
> > > Do we want to provide a combination of UBSAN on/off for every
> > > group, or can we simply provide an over-sanitized version with
> > > UBSAN on?
> > >
> > > (C) How should the Clang driver expose the sanitized libraries to
> > > the users?
> > > -----------------------------------------------------------------
> > >
> > > I would like to propose the driver option '-fsanitize-stdlib' and
> > > '-fsanitize-stdlib=<sanitizer>'.
> > > The first version deduces the best sanitized version to use, the
> > > second
> > > allows it to be explicitly specified.
> > >
> > > A couple of other options are:
> > >
> > > * -fsanitize=foo:  Implicitly turn on a sanitized STL. Clang
> > > deduces
> > > which version.
> > > * -stdlib=libc++-<sanitizer>: Explicitly turn on and choose a
> > > sanitized STL.
> > >
> > > (D) Should sanitized libc++ versions override libc++.so?
> > > --------------------------------------------------------
> > >
> > > For example, what happens when a program links to both a sanitized
> > > and
> > > non-sanitized libc++ version?
> > > Does the sanitized version replace the non-sanitized version, or
> > > should
> > > both versions be loaded into the program?
> > >
> > > Essentially I'm asking if the sanitized versions of libc++ should
> > > have
> > > the "soname" libc++ so they can
> > > replace non-sanitized version, or if they should have a different
> > > "soname" so the linker treats them as a separate library.
> > >
> > > I haven't looked into the consequences of either approach in depth,
> > > but
> > > any input is appreciated.
> >
> > In a sense, these are /just/ multilibs, so my inclination would be to
> > make all the sonames the same, and just stick them in appropriately
> > named subfolders relative to their normal location.
>
> I'm not sure that's true; there's no property of the environment that
> determines which library path you need. As a practical matter, I can't set
> $PLATFORM and/or $LIB in my rpath and have ld.so do the right thing in this
> context. Moreover, it is really a property of how you compiled, so I think
> using an alternate library name is natural.
>
>  -Hal
>
> >
> >
> > Jon
> >
> > >
> > > Conclusion
> > > -----------------
> > >
> > > I hope my proposal and questions have made sense. Any and all input
> > > is
> > > appreciated.
> > > Please let me know if anything needs clarification.
> > >
> > > /Eric
> > >
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > cfe-dev mailing list
> > > cfe-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > >
> >
> > --
> > Jon Roelofs
> > jonathan at codesourcery.com
> > CodeSourcery / Mentor Embedded
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>