[LLVMdev] RFC: How can AddressSanitizer, ThreadSanitizer, and similar runtime libraries leverage shared library code?

Thu Jun 21 00:21:22 PDT 2012

Hi,

Yes, STLport was a pain to deploy and maintain, and it calls the normal
operator new/delete (there is no way to put them into a separate namespace).

Note that in some codebases we build asan/tsan runtimes from source. How
will the build process look with that object file mangling? How easy is it
to integrate it into a custom build process?

Soon I will start integrating tsan into the Go language. For Go we need
very simple object files: no global ctors, no thread-local storage, no
weak symbols, and no other trickery. Basically what a portable C compiler
could have produced.
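
To illustrate one way such a constraint could be checked in a custom build,
here is a minimal Python sketch that scans the text output of `nm` for an
object file and reports weak symbols. The function name `find_weak_symbols`
and the sample input are made up for illustration. The sketch assumes the
usual `nm` line layout (optional value, one type letter, symbol name), where
the type letters W, w, V and v mark weak symbols.

```python
# Type letters that GNU-style nm prints for weak symbols.
WEAK_TYPES = {"W", "w", "V", "v"}


def find_weak_symbols(nm_output):
    """Return the names of weak symbols found in nm's text output."""
    weak = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined symbols have no value column, so read from the right:
        # the last field is the name, the one before it is the type letter.
        if len(parts) < 2:
            continue
        sym_type, name = parts[-2], parts[-1]
        if sym_type in WEAK_TYPES:
            weak.append(name)
    return weak


# Hypothetical nm output for a runtime object file.
SAMPLE = """\
0000000000000000 T tsan_init
                 U memcpy
0000000000000000 W _ZdlPv
                 w __gmon_start__
"""

if __name__ == "__main__":
    print(find_weak_symbols(SAMPLE))
```

A build step could fail if the returned list is non-empty. Global ctors and
thread-local storage would need separate checks that this sketch does not
attempt.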

On Wed, Jun 20, 2012 at 10:05 AM, Chandler Carruth <chandlerc at google.com> wrote:

>>>>>> Hello folks (and sorry if I've forgotten to CC anyone with particular
>>>>>> interest to this discussion...):
>>>>>>
>>>>>> I've been thinking a lot about how best to build advanced runtime
>>>>>> libraries like ASan, and scale them up. Note that this does *not* try to
>>>>>> address any licensing issues. For now, I'll consider those orthogonal /
>>>>>> solvable w/o technical contortions. =]
>>>>>>
>>>>>> My primary motivation: we really, *really* need runtime libraries to
>>>>>> be able to use common, shared libraries.
>>>>>>
>>>>>
>>>>> I am not sure you understand the problem as we do.
>>>>>
>>>>> In short, asan/tsan/msan/etc can not use any function which is also
>>>>> called from the instrumented binary.
>>>>>
>>>>
>>>> Well, I can't be sure, but this description certainly agrees with my
>>>> understanding -- you need *every* part of the runtime to be completely
>>>> separate from *every* part of the instrumented binary. I'm with you there.
>>>>
>>>> In particular, I think the current strategy for libc & system calls
>>>> makes perfect sense, and I'm not trying to suggest changing it.
>>>>
>>>> I think the most similar situation is this one:
>>>>
>>>>> In the previous version of ThreadSanitizer we used a private copy of
>>>>> STLport in a separate namespace and a custom libc (small subset).
>>>>>
>>>>
>>>> My proposal is very similar except without the need to modify the C++
>>>> standard library in use. Instead, I'm suggesting post-processing the
>>>> library to ensure that the standard C++ library code in the runtime is kept
>>>> completely distinct from that in the instrumented binary -- everything would
>>>> in fact be *mangled* differently.
>>>>
>>>> The goal would be to avoid the maintenance overhead of a custom C++
>>>> standard library, and instead use a normal one. My understanding is that
>>>> both GCC's libstdc++ and LLVM's libc++ are significantly higher quality
>>>> than STLport, and if we're doing static linking, the code bloat should be
>>>> greatly reduced. We could reduce it still further by doing LTO of the
>>>> runtime library, which should be very straightforward given the rest of my
>>>> proposal.
>>>>
>>>> It would still require a very small subset of libc, likely not much
>>>> more than you already have.
>>>>
>>>>> This worked, but had problems too (Dmitry was very angry at STLport
>>>>> for code bloat, stack size increase and some direct libc calls).
>>>>>
>>>>
>>>> I would be interested to know if the above addresses most of the
>>>> problems or not.
>>>>
>>>>
>>>>>  Until recently this was not causing too much pain in asan/tsan, but
>>>>> our attempts to use the LLVM DWARF readers made it worse.
>>>>> When tsan finds a race, we need to symbolize it online to be able to
>>>>> match against a suppression and decide whether we want to emit the warning.
>>>>> Today we do it in a separate addr2line process (ugly and slow).
>>>>> But if we start calling the LLVM dwarf reader we end up with all
>>>>> possible dependency problems (Dmitry and Alexey will know the exact ones)
>>>>> because the LLVM code calls to malloc, memcpy, etc.
>>>>>
>>>>> Frankly, I don't have any solution other than to change the code such
>>>>> that it does not call libc/libc++.
>>>>> Some of that may be solved by a private copy of STLport + a bit of
>>>>> custom libc (but see above about STLport)
>>>>>
>>>>
>>>> I think my proposal is essentially in between these two:
>>>>
>>>> - Avoid the need for a low quality STL by using a normal C++ standard
>>>> library implementation, and avoid maintenance burden by doing a link-time
>>>> mangling of the symbols.
>>>>
>>>
>>> re-linking might be too platform specific.
>>> How about compiling the library into LLVM bitcode and adding
>>> namespaces/prefixes to that bitcode?
>>>
>>
>> Re-linking is a bit platform specific...
>>
>> It would definitely work on ELF platforms, and likely on Darwin, but
>> Windows is tricky.
>>
>> On Windows we would at least need a custom tool, but I suspect such a tool
>> would be quite easy to write. We could even use the very LLVM libraries in
>> question to write it! ;] Amusingly, I think with the LLVM libraries we
>> could write a custom tool that just mangles the symbol names in a
>> collection of object files, and have it work on *most* platforms!
>>
>> Still, the bitcode idea is interesting. Doing this entirely in bitcode
>> has some advantages as these types of runtimes are among the best uses for
>> things like LTO: they're small, performance sensitive, can enumerate the
>> entry points easily, and are likely to have a particular need for dead code
>> elimination.
>>
>
> One reason to want to have some support for doing this w/o bitcode: we may
> not have the bitcode. Specifically, the goal would be to use the "normal"
> C++ standard library, provided it is available to link statically
> (libstdc++ and libc++ certainly are, I don't know about MSVC). That would
> be much easier if we can actually use the existing archive file, and just
> "fix" the .o files inside it.
>
> It seems likely to be the equivalent of an 'ld -r' run with a linker
> script to munge the symbol names, or potentially a custom tool written with
> the LLVM object file libraries.
>
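
To make the "munge the symbol names" step concrete, here is a minimal Python
sketch that produces a rename map in the "old new" one-pair-per-line format
accepted by `objcopy --redefine-syms=FILE`. It is an illustration only, not
the tool proposed above: the function name `build_rename_map`, the `__rt_`
prefix, and the sample symbols are all hypothetical, and the list of symbols
is assumed to come from somewhere else (for example, from `nm` output).

```python
def build_rename_map(symbols, prefix, keep=()):
    """Build "old new" lines that give every symbol a private prefix.

    Symbols listed in `keep` (e.g. the runtime's public entry points)
    are left out of the map so that they keep their original names.
    """
    keep = set(keep)
    return "".join(
        f"{sym} {prefix}{sym}\n" for sym in symbols if sym not in keep
    )


if __name__ == "__main__":
    # Hypothetical symbols defined by the runtime and its private libstdc++.
    symbols = ["_Znwm", "_ZdlPv", "__asan_init"]
    print(build_rename_map(symbols, "__rt_", keep=["__asan_init"]), end="")
```

The resulting text could be written to a file and applied to each .o file in
the archive. Prefixing a mangled C++ name does not produce a valid mangled
name, but for this purpose it only has to differ from the name used by the
instrumented binary.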