[cfe-dev] Making MSAN Easier to Use: Providing a Sanitized Libc++

Hal Finkel via cfe-dev cfe-dev at lists.llvm.org
Sun Aug 14 15:41:37 PDT 2016


----- Original Message -----

> From: "Eric Fiselier via cfe-dev" <cfe-dev at lists.llvm.org>
> To: "clang developer list" <cfe-dev at lists.llvm.org>, "Chandler
> Carruth" <chandlerc at gmail.com>, "Kostya Serebryany"
> <kcc at google.com>, "Evgenii Stepanov" <eugenis at google.com>
> Sent: Sunday, August 14, 2016 5:05:57 PM
> Subject: [cfe-dev] Making MSAN Easier to Use: Providing a Sanitized
> Libc++

> Sanitizers such as MSAN require the entire program to be
> instrumented; anything less leads to plenty of false positives.
> Unfortunately this can be difficult to achieve, especially for the C
> and C++ standard libraries. To work around this the sanitizers
> provide interceptors for common C functions, but the same solution
> doesn't work as well for the C++ STL. Instead users are forced to
> manually build and link a custom sanitized libc++. This is a huge
> PITA and I would like to improve the situation, not just for MSAN
> but all sanitizers.
I've not thought deeply about the deployment model here, but this is certainly an important problem. Thanks for working on this. We need to figure out a way of automatically providing users with a sanitized STL in a straightforward manner. I'd prefer that they automatically get the appropriately-instrumented runtime, by default, just by providing the -fsanitize=... flag. The same issue comes up for other runtimes, such as the OpenMP runtime library. 

-Hal 
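
To make the false-positive problem from the quoted paragraph concrete, here is a minimal sketch of the kind of code that MSAN may flag when the user's code is instrumented but the standard library is not. The function names are hypothetical, and the comment about where the parsing code lives is an assumption about the libc++ implementation, not something stated in this thread.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical example. Parses a decimal integer from text with a stream.
// Assumption: the number parsing is done by code compiled into the C++
// standard library itself, so an uninstrumented library writes the result
// without MSAN seeing the write.
int parse_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    // Precondition check: the text must hold a decimal integer.
    assert(!in.fail());
    return value;
}

// Branches on the parsed value. A branch on data that MSAN believes is
// uninitialized is the point where a report would be raised.
bool is_answer(const std::string& text) {
    return parse_int(text) == 42;
}
```

The code is correct, and `is_answer("42")` returns true in an ordinary build. Any report from a build with `-fsanitize=memory` against an uninstrumented standard library would therefore be a false positive of the kind described above.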

> I'm working on a proposal to change this. The basis of my proposal
> is:

> Clang should install/provide multiple sanitized versions of Libc++
> and a mechanism to easily link them, as if they were a Compiler-RT
> runtime.

> The goal of this proposal is:

> (1) Greatly reduce the number of false positives caused by using an
> un-sanitized STL.
> (2) Allow sanitizers to catch user bugs that occur within the STL
> library, not just its headers.

> The basic steps I would like to take to achieve this are:

> (1) Teach the compiler-rt CMake how to build and install each
> sanitized libc++ version alongside its other runtimes.
> (2) Add options to the Clang driver to support linking/using these
> libraries.

> I think this proposal is likely to be contentious, so I would like to
> focus on its details. Once I have some feedback on these details
> I'll put together a formal proposal, including a plan for
> implementing it.
> The details I would like input on are:

> (A) What kind and how many sanitized versions of libc++ should we
> provide?
> --------------------------------------------------------------------

> I think the minimum set would be Address (which includes Leak),
> Memory (With origin tracking?), Thread, and Undefined.
> Once we get into combinations of sanitizers things get more
> complicated. What other sanitizer combinations should we provide?

> (B) How should we handle UBSAN?
> ---------------------------------------------------

> UBSAN is really just a collection of sanitizers and providing
> sanitized versions of libc++ for every possible configuration is out
> of the question.
> Instead we should figure out what subset of UBSAN checks we want to
> enable in sanitized libc++ versions. I suspect we want to disable
> the following checks.

> * -fsanitize=vptr
> * -fsanitize=function
> * -fsanitize=float-divide-by-zero

> Additionally, UBSAN can be combined with every other sanitizer group
> (i.e., Address, Memory, Thread).
> Do we want to provide a combination of UBSAN on/off for every group,
> or can we simply provide an over-sanitized version with UBSAN on?
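
To put rough numbers on the two options in the question above, here is a small sketch that enumerates the library variants each option implies. The variant names are hypothetical, and the sketch assumes one variant per sanitizer group plus a standalone UBSAN variant; it ignores extra configurations such as MSAN origin tracking.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical variant names; none of these are existing library names.
// ubsan_on_and_off == true:  every group is built both with and without UBSAN.
// ubsan_on_and_off == false: every group is built once, with UBSAN enabled.
std::vector<std::string> enumerate_variants(bool ubsan_on_and_off) {
    const std::vector<std::string> groups = {"asan", "msan", "tsan"};
    std::vector<std::string> variants;
    variants.push_back("ubsan");  // UBSAN on its own
    for (const std::string& group : groups) {
        if (ubsan_on_and_off) {
            variants.push_back(group);  // the group without UBSAN
        }
        variants.push_back(group + "+ubsan");  // the group with UBSAN
    }
    return variants;
}

// Counts for the two options under the assumptions stated above.
void check_variant_counts() {
    assert(enumerate_variants(true).size() == 7);
    assert(enumerate_variants(false).size() == 4);
}
```

Under these assumptions the on/off option needs seven builds of the library and the over-sanitized option needs four.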

> (C) How should the Clang driver expose the sanitized libraries to the
> users?
> --------------------------------------------------------------------

> I would like to propose the driver option '-fsanitize-stdlib' and
> '-fsanitize-stdlib=<sanitizer>'.
> The first form deduces the best sanitized version to use; the
> second allows it to be specified explicitly.

> A couple of other options are:

> * -fsanitize=foo: Implicitly turn on a sanitized STL. Clang deduces
> which version.
> * -stdlib=libc++-<sanitizer>: Explicitly turn on and choose a
> sanitized STL.
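
As a rough illustration of what "deduces the best sanitized version" could mean, here is a minimal sketch that maps the enabled `-fsanitize=` values to a library variant. The function name, the variant names, and the priority order are all hypothetical; this is not code from the Clang driver.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: picks a sanitized library variant from the set of
// enabled -fsanitize= values. Returns an empty string when no sanitized
// variant applies and the regular library should be used.
std::string deduce_sanitized_stdlib(const std::set<std::string>& enabled) {
    const bool asan = enabled.count("address") != 0;
    const bool msan = enabled.count("memory") != 0;
    const bool tsan = enabled.count("thread") != 0;
    // Precondition: Clang rejects combinations of these three sanitizers,
    // so at most one of them can be enabled at a time.
    assert(int(asan) + int(msan) + int(tsan) <= 1);
    if (msan) return "msan";
    if (asan) return "asan";
    if (tsan) return "tsan";
    // Assumption: UBSAN only selects its own variant when used alone.
    if (enabled.count("undefined") != 0) return "ubsan";
    return "";
}
```

For example, `deduce_sanitized_stdlib({"memory", "undefined"})` yields `"msan"` here. Whether a combined variant should be chosen instead depends on the answer to question (B).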

> (D) Should sanitized libc++ versions override libc++.so?
> --------------------------------------------------------

> For example, what happens when a program links to both a sanitized
> and non-sanitized libc++ version?
> Does the sanitized version replace the non-sanitized version, or
> should both versions be loaded into the program?

> Essentially, I'm asking whether the sanitized versions of libc++
> should have the "soname" libc++ so they can replace the
> non-sanitized version, or whether they should have a different
> "soname" so the linker treats them as a separate library.

> I haven't looked into the consequences of either approach in depth,
> but any input is appreciated.

> Conclusion
> -----------------

> I hope my proposal and questions have made sense. Any and all input
> is appreciated.
> Please let me know if anything needs clarification.

> /Eric


-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 