[cfe-dev] Making MSAN Easier to Use: Providing a Sanitized Libc++

Eric Fiselier via cfe-dev cfe-dev at lists.llvm.org
Sun Aug 14 15:05:57 PDT 2016

Sanitizers such as MSAN require the entire program to be instrumented;
anything less leads to plenty of false positives. Unfortunately this can be
difficult to achieve, especially for the C and C++ standard libraries. To
work around this the sanitizers provide interceptors for common C
functions, but the same solution doesn't work as well for the C++ STL.
Instead users are forced to manually build and link a custom sanitized
libc++. This is a huge PITA and I would like to improve the situation, not
just for MSAN but all sanitizers. I'm working on a proposal to change this.
The basis of my proposal is:

Clang should install/provide multiple sanitized versions of Libc++ and a
mechanism to easily link them, as if they were a Compiler-RT runtime.

The goals of this proposal are:

(1) Greatly reduce the number of false positives caused by using an
un-sanitized STL.
(2) Allow sanitizers to catch user bugs that occur within the STL library,
not just its headers.
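
To make goal (1) concrete, here is a minimal sketch of correct code of the
kind that tends to produce MSAN reports when the standard library is not
instrumented. The helper names format_number and is_negative are made up for
illustration, and the comments describe an assumption about where the
formatting work happens, not a statement about a specific libc++ build.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, for illustration only.
// The code is well-defined: `value` is initialized and so is the result.
// Assumption: with -fsanitize=memory and an uninstrumented standard library,
// stores performed by the stream's number formatting inside the precompiled
// library are not tracked, so the returned characters can look
// uninitialized to MSAN.
std::string format_number(int value) {
    std::ostringstream out;
    out << value;  // part of this work may run in library code
    return out.str();
}

// Branching on the result is where such a report would typically surface.
bool is_negative(int value) {
    return format_number(value)[0] == '-';
}
```

With a fully instrumented libc++ the same code should run cleanly, which is
the situation the proposal is trying to make easy to reach.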

The basic steps I would like to take to achieve this are:

(1) Teach the compiler-rt CMake how to build and install each sanitized
libc++ version alongside its other runtimes.
(2) Add options to the Clang driver to support linking/using these
libraries.

I think this proposal is likely to be contentious, so I would like to focus
on the details of it. Once I have some feedback on these details I'll put
together a formal proposal, including a plan for implementing it.
The details I would like input on are:

(A) What kind and how many sanitized versions of libc++ should we provide?

I think the minimum set would be Address (which includes Leak), Memory
(With origin tracking?), Thread, and Undefined.
Once we get into combinations of sanitizers things get more complicated.
What other sanitizer combinations should we provide?

(B) How should we handle UBSAN?

UBSAN is really just a collection of sanitizers and providing sanitized
versions of libc++ for every possible configuration is out of the question.
Instead we should figure out what subset of UBSAN checks we want to enable
in sanitized libc++ versions. I suspect we want to disable the following:

* -fsanitize=vptr
* -fsanitize=function
* -fsanitize=float-divide-by-zero

Additionally, UBSAN can be combined with every other sanitizer group (i.e.
Address, Memory, Thread).
Do we want to provide a combination of UBSAN on/off for every group, or can
we simply provide an over-sanitized version with UBSAN on?
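
The size of that choice can be sketched by enumerating the builds each option
implies. The variant names below ("asan", "msan+ubsan", and so on) are
hypothetical; no naming scheme has been decided. The sketch only counts the
three mutually exclusive groups and leaves out a UBSAN-only build and MSAN
origin tracking.

```cpp
#include <string>
#include <vector>

// Hypothetical variant names, for illustration only.
// Address, Memory and Thread cannot be combined with each other, so each
// one forms its own group.
std::vector<std::string> libcxx_variants(bool ubsan_on_and_off) {
    const std::vector<std::string> groups = {"asan", "msan", "tsan"};
    std::vector<std::string> variants;
    for (const std::string& group : groups) {
        if (ubsan_on_and_off)
            variants.push_back(group);         // build with UBSAN off
        variants.push_back(group + "+ubsan");  // build with UBSAN on
    }
    return variants;
}
```

Under these assumptions, libcxx_variants(true) lists six builds and
libcxx_variants(false) lists three, so the over-sanitized option halves the
number of libraries to build and install.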

(C) How should the Clang driver expose the sanitized libraries to the users?

I would like to propose the driver option '-fsanitize-stdlib' and
The first version deduces the best sanitized version to use; the second
allows it to be explicitly specified.

A couple of other options are:

* -fsanitize=foo: Implicitly turn on a sanitized STL. Clang deduces which
  version to use.
* -stdlib=libc++-<sanitizer>: Explicitly turn on and choose a sanitized STL.
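
Whichever spelling is chosen, the "deduce" case needs some rule that maps the
enabled sanitizers to a library. A minimal sketch of one possible rule
follows; the function name, the returned variant names, and the rule itself
are assumptions made for illustration, not existing Clang driver behavior.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical deduction rule, for illustration only.
// Input:  the values passed to -fsanitize= (e.g. {"memory", "undefined"}).
// Output: the name of the sanitized libc++ variant to link, or an empty
//         string when no sanitized variant applies.
std::string deduce_stdlib_variant(const std::vector<std::string>& sanitizers) {
    auto enabled = [&](const char* name) {
        return std::find(sanitizers.begin(), sanitizers.end(), name) !=
               sanitizers.end();
    };
    // Address, Memory and Thread are mutually exclusive, so at most one of
    // these should match in a valid command line.
    if (enabled("address")) return "address";
    if (enabled("memory")) return "memory";
    if (enabled("thread")) return "thread";
    // UBSAN alone falls back to a UBSAN-only variant.
    if (enabled("undefined")) return "undefined";
    return "";
}
```

In this sketch UBSAN never changes the choice when another group is enabled,
which corresponds to shipping only over-sanitized variants as asked about in
(B).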

(D) Should sanitized libc++ versions override libc++.so?

For example, what happens when a program links to both a sanitized and
non-sanitized libc++ version?
Does the sanitized version replace the non-sanitized version, or should
both versions be loaded into the program?

Essentially I'm asking if the sanitized versions of libc++ should have the
"soname" libc++ so they can replace the non-sanitized version, or if they
should have a different "soname" so the linker treats them as a separate
library.

I haven't looked into the consequences of either approach in depth, but any
input is appreciated.
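
One way to observe the outcome of either choice is to list the libc++ copies
loaded into a running process. The sketch below assumes Linux with glibc,
where dl_iterate_phdr from <link.h> is available. The helper names and the
file-name match are made up for illustration, and a sanitized library with a
different file name would need a different match.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (Linux/glibc assumed)

#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name check: matches paths containing "libc++" but not
// "libc++abi".
bool is_libcxx_name(const std::string& path) {
    return path.find("libc++") != std::string::npos &&
           path.find("libc++abi") == std::string::npos;
}

// Callback invoked once per loaded object; collects matching paths.
static int collect_libcxx(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* found = static_cast<std::vector<std::string>*>(data);
    const std::string path = info->dlpi_name ? info->dlpi_name : "";
    if (is_libcxx_name(path)) found->push_back(path);
    return 0;  // zero means "keep iterating"
}

// Returns the paths of all libc++ copies loaded into this process.
std::vector<std::string> loaded_libcxx_copies() {
    std::vector<std::string> found;
    dl_iterate_phdr(collect_libcxx, &found);
    return found;
}
```

If loaded_libcxx_copies() returns more than one entry, both a sanitized and a
non-sanitized copy ended up in the process, which is the case a separate
"soname" would allow.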


I hope my proposal and questions have made sense. Any and all input is
appreciated. Please let me know if anything needs clarification.
