[LLVMdev] RFC: How can AddressSanitizer, ThreadSanitizer, and similar runtime libraries leverage shared library code?

Thu Jun 21 01:04:35 PDT 2012

On Thu, Jun 21, 2012 at 11:52 AM, Chandler Carruth <chandlerc at google.com> wrote:
>
> Hi,
>>
>> Yes, stlport was a pain to deploy and maintain + it calls normal operator
>> new/delete (there is no way to put them into a separate namespace).
>>
>
> Ok, but putting the raw symbols into a "namespace" with the linker
> shouldn't be subject to these limitations.
>

OK

>
>> Note that in some codebases we build asan/tsan runtimes from source. How
>> will the build process look with that object file mangling? How easy is it
>> to integrate it into a custom build process?
>>
>
> Well, I don't know yet. ;] It was an idea, I don't have an implementation
> at this point. That said, I had only really imagined building the runtimes
> from source? Maybe I don't understand what you mean by this?
>
> The vague strategy I am imagining for the build process is this:
>
> 1) compile runtime into a static library, just like any other static
> library
>
> 2) collect all the '.o' files in the static archive, and in any
> dependencies' static archive libraries
>
> 3) for each 'foo.o' build a 'foo_munged.o' using $tool; the _munged
> version has every symbol that is not on the whitelist for export to the
> instrumented binary renamed (munged)
>
> 4) put all of the _munged '.o' files into a single runtime archive
>
>
> The $tool here could be "ld -r" with a linker script, or (likely necessary
> on Windows) a very simple, dedicated tool built around the LLVM object
> libraries to copy each symbol, munging the name.
>
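[Editor's sketch, not part of the original thread.] Steps 3 and 4 above could be prototyped roughly as follows. This is only an illustration under stated assumptions: it swaps in GNU binutils `objcopy --redefine-syms` as the $tool (the message itself proposes "ld -r" with a linker script or a dedicated LLVM-based tool), and the prefix `__rt_private_`, the helper names, and the example symbols are all hypothetical. The script only plans the commands; it does not run them.

```python
from pathlib import PurePosixPath

# Hypothetical prefix used to push private symbols into a linker-level "namespace".
PRIVATE_PREFIX = "__rt_private_"


def build_rename_map(defined_symbols, whitelist, prefix=PRIVATE_PREFIX):
    """Map every defined symbol that is NOT on the export whitelist to a prefixed name.

    Only symbols defined inside the runtime (and its bundled dependencies) should
    be passed in; undefined references to external libraries must keep their names.
    """
    rename = {}
    for name in sorted(set(defined_symbols)):
        if name in whitelist or name.startswith(prefix):
            continue  # exported on purpose, or already munged
        rename[name] = prefix + name
    return rename


def redefine_syms_text(rename):
    """Render the map as 'old new' lines, the file format read by objcopy --redefine-syms."""
    return "".join(f"{old} {new}\n" for old, new in sorted(rename.items()))


def munged_name(obj_path):
    """foo.o -> foo_munged.o, keeping the directory part."""
    p = PurePosixPath(obj_path)
    return str(p.with_name(p.stem + "_munged" + p.suffix))


def plan_commands(objects, map_file, archive):
    """Return the command lines for step 3 (one objcopy per object) and step 4 (one ar)."""
    cmds = [
        ["objcopy", f"--redefine-syms={map_file}", obj, munged_name(obj)]
        for obj in objects
    ]
    cmds.append(["ar", "rcs", archive] + [munged_name(obj) for obj in objects])
    return cmds
```

For example, `build_rename_map(["helper_alloc", "__asan_init"], {"__asan_init"})` leaves `__asan_init` alone and maps `helper_alloc` to `__rt_private_helper_alloc`. A real implementation would also need to collect the defined symbols from each object (for instance with `nm`) and decide how to handle weak symbols, which this sketch ignores.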
>
>> Soon I will start integrating tsan into the Go language. For Go
>> we need very simple object files.
>>
>
> Ok... I'm not sure whether this should really constrain the way we build
> the core runtime system here though. If you need some logic on the tsan
> side factored out into a separate library for use with Go, that would seem
> simpler than trying to make one sanitizer runtime library to support
> frontends, middle ends, and programming languages with totally separate
> models.
>

Yes, it will be a separate runtime library. But if the tsan sources depend
heavily on the LLVM sources, this may be significantly harder to do.

>> No global ctors, no thread-local storage, no weak symbols and other
>> trickery. Basically what a portable C compiler could have produced.
>>
>
> These also don't seem insurmountable, even in the existing use cases. But
> maybe I'm not considering the actual restrictions you are, or I've
> misunderstood. Here is how I'm breaking down the things you've mentioned:
>

>
> 1) It seems reasonable to avoid global constructors, and do-able in C++
> even when using the standard library and parts of LLVM. LLVM itself
> specifically works to avoid them.
>

Is that also the case for the C++ standard library that LLVM uses?

> 2) TLS doesn't seem to be required by anything I'm suggesting... is there
> something that worries you about this?
>

I suspect that the C/C++ libraries may use TLS.

> 3) I don't understand the requirement to have no weak symbols. Even a
> portable C compiler might produce weak symbols?
>

The linker does not understand them.

> Still, during the re-linking phase above, it should be possible to resolve
> any weak symbols?
>

Well, most likely yes.

There may be additional limitations that I don't know about yet.