[cfe-dev] Making MSAN Easier to Use: Providing a Sanitized Libc++

Jonathan Roelofs via cfe-dev cfe-dev at lists.llvm.org
Sun Aug 14 17:07:00 PDT 2016

On 8/14/16 4:05 PM, Eric Fiselier via cfe-dev wrote:
> Sanitizers such as MSAN require the entire program to be instrumented;
> anything less leads to plenty of false positives. Unfortunately this can
> be difficult to achieve, especially for the C and C++ standard
> libraries. To work around this the sanitizers provide interceptors for
> common C functions, but the same solution doesn't work as well for the
> C++ STL. Instead users are forced to manually build and link a custom
> sanitized libc++. This is a huge PITA and I would like to improve the
> situation, not just for MSAN but all sanitizers. I'm working on a
> proposal to change this. The basis of my proposal is:
> Clang should install/provide multiple sanitized versions of Libc++ and a
> mechanism to easily link them, as if they were a Compiler-RT runtime.
> The goal of this proposal is:
> (1) Greatly reduce the number of false positives caused by using an
> un-sanitized STL.
> (2) Allow sanitizers to catch user bugs that occur within the STL
> library, not just its headers.
> The basic steps I would like to take to achieve this are:
> (1) Teach the compiler-rt CMake how to build and install each sanitized
> libc++ version alongside its other runtimes.
> (2) Add options to the Clang driver to support linking/using these
> libraries.
> I think this proposal is likely to be contentious, so I would like to
> focus on its details first. Once I have some feedback on these details I'll
> put together a formal proposal, including a plan for implementing it.
> The details I would like input on are:
> (A) What kind and how many sanitized versions of libc++ should we provide?
> ---------------------------------------------------------------------------------------------------------------
> I think the minimum set would be Address (which includes Leak), Memory
> (With origin tracking?), Thread, and Undefined.
> Once we get into combinations of sanitizers things get more complicated.
> What other sanitizer combinations should we provide?
> (B) How should we handle UBSAN?
> ---------------------------------------------------
> UBSAN is really just a collection of sanitizers and providing sanitized
> versions of libc++ for every possible configuration is out of the question.
> Instead we should figure out what subset of UBSAN checks we want to
> enable in sanitized libc++ versions. I suspect we want to disable the
> following checks.
> * -fsanitize=vptr
> * -fsanitize=function
> * -fsanitize=float-divide-by-zero
> Additionally, UBSAN can be combined with every other sanitizer group (i.e.,
> Address, Memory, Thread).
> Do we want to provide a combination of UBSAN on/off for every group, or
> can we simply provide an over-sanitized version with UBSAN on?
> (C) How should the Clang driver expose the sanitized libraries to the users?
> -------------------------------------------------------------------------------------------------------------
> I would like to propose the driver option '-fsanitize-stdlib' and
> '-fsanitize-stdlib=<sanitizer>'.
> The first version deduces the best sanitized version to use, the second
> allows it to be explicitly specified.
> A couple of other options are:
> * -fsanitize=foo:  Implicitly turn on a sanitized STL. Clang deduces
> which version.
> * -stdlib=libc++-<sanitizer>: Explicitly turn on and choose a sanitized STL.
> (D) Should sanitized libc++ versions override libc++.so?
> -------------------------------------------------------------------------------------------
> For example, what happens when a program links to both a sanitized and
> non-sanitized libc++ version?
> Does the sanitized version replace the non-sanitized version, or should
> both versions be loaded into the program?
> Essentially I'm asking if the sanitized versions of libc++ should have
> the "soname" libc++ so they can replace the non-sanitized version, or if
> they should have a different "soname" so the linker treats them as a
> separate library.
> I haven't looked into the consequences of either approach in depth, but
> any input is appreciated.

In a sense, these are /just/ multilibs, so my inclination would be to
make all the sonames the same, and just stick them in appropriately
named subfolders relative to their normal location.
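
To make the multilib idea concrete, here is a rough sketch of how a
sanitizer selection could map to a library search directory while the
soname stays libc++ everywhere. The subfolder names (asan, msan, tsan,
ubsan), the helper libcxx_dir and the /usr/lib base path are all
hypothetical, chosen only for illustration; nothing in this thread fixes
them, and this is not how the Clang driver is implemented.

```python
from pathlib import PurePosixPath

# Hypothetical subfolder names, one per sanitized libc++ build.
# The soname is the same in every folder; only the location differs.
SANITIZER_SUBDIRS = {
    frozenset({"address"}): "asan",
    frozenset({"memory"}): "msan",
    frozenset({"thread"}): "tsan",
    frozenset({"undefined"}): "ubsan",
}


def libcxx_dir(base_libdir, sanitizers):
    """Return the directory to search for libc++ given the enabled sanitizers.

    Falls back to the normal (uninstrumented) location when no sanitizer
    is enabled or when no matching sanitized build is known.
    """
    base = PurePosixPath(base_libdir)
    subdir = SANITIZER_SUBDIRS.get(frozenset(sanitizers))
    if not subdir:
        return base
    return base / subdir


if __name__ == "__main__":
    print(libcxx_dir("/usr/lib", ["memory"]))
    print(libcxx_dir("/usr/lib", []))
```

With a layout like this, picking a sanitized library is only a matter of
which directory comes first on the search path, which is the usual
multilib behaviour. Combinations that have no dedicated build (for
example Address plus Undefined in the table above) fall back to the
plain library in this sketch; a real driver would need a policy for
those cases.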


> Conclusion
> -----------------
> I hope my proposal and questions have made sense. Any and all input is
> appreciated.
> Please let me know if anything needs clarification.
> /Eric

Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded
