[llvm-commits] PATCH: Inserting LLVM code inside compiler-rt libraries

Tue Aug 7 10:21:15 PDT 2012

On Aug 6, 2012, at 11:22 PM, Alexey Samsonov wrote:

> 
> 
> On Mon, Aug 6, 2012 at 10:35 PM, Anna Zaks <ganna at apple.com> wrote:
> 
> On Aug 6, 2012, at 10:46 AM, Chandler Carruth wrote:
> 
>> On Mon, Aug 6, 2012 at 10:29 AM, Anna Zaks <ganna at apple.com> wrote:
>> 
>> On Aug 2, 2012, at 11:09 AM, Chandler Carruth wrote:
>> 
>>> On Thu, Aug 2, 2012 at 10:59 AM, Alexey Samsonov <samsonov at google.com> wrote:
>>> Hi, llvm-commits
>>> 
>>> Here: http://codereview.appspot.com/6458066/ is a short experimental FYI patch that allows LLVM sources to be "compiled into" static compiler-rt libraries in the CMake build system.
>> 
>> Any plans on updating the autoconf+make build system as well?
>> 
>> I cannot even read the makefiles of compiler-rt. I don't want to touch them, and have encouraged others not to touch them as a consequence. I think you'll need to talk to ddunbar; the last time I complained about the poor readability and maintainability of that make system, he told me he would just implement whatever we needed....
>>  
>> 
>>> (I'm not sure it is smart enough to capture all dependencies, though).
>>> With something that simple we can:
>>> 1) directly use LLVM code from compiler-rt libraries.
>>> 2) work around the unavailability of LLVM libraries compiled for several targets: each static compiler-rt lib will contain its own private copy of the LLVM libs, compiled for the necessary target and with the necessary compile flags.
>>> 
>>> I've not looked at the patch yet, but some initial points... I'm a bit sad to start *another* thread; much of this was discussed in my RFC thread from some time ago, but here is a recap:
>>>  
>>> And there are multiple drawbacks:
>>> 1) License issues (LLVM code has a binary redistribution clause, right? So everything built with "clang -faddress-sanitizer" would have to carry the LLVM license attribution, gr-r-r).
>>> 
>>> We are looking into fixing this; I'm fairly confident that, in one form or another, this will largely be a temporary issue. Let's not discuss that to death here.
>>>  
>>> 2) The static ASan runtime is now 10x larger (2.5M vs 250K), while most of its functionality (various stuff from LLVMSupport) is not needed.
>>> 
>>> This is simply poor structuring of the library, or misbehavior by the linker. We should figure out what's causing it and fix this.
>>>  
>>> 3) Symbol name clashes - suppose one wants to build something with ASan and link against the "normal" version of LLVMSupport. (Can compiling the code with -fvisibility=hidden, as we currently do, help with this?)
>>> 
>>> -fvisibility=hidden won't help at all.
>>> 
>>> This is what my RFC was about, specifically solving this problem. Perhaps we should actually go that route? ;] It keeps coming up, and there is a fairly direct solution that is a "small matter of code" to achieve.
>>> 
>>> Does this direction look promising to you?
>>> 
>>> Yes.
>>>  
>>> Maybe we should turn to using a DSO instead?
>>> 
>>> We could in theory, but I was under the distinct impression that DSOs were a non-starter for several different reasons:
>>> 
>>> 1) Introduces complex rpath requirements into binaries, making distribution even harder.
>>> 2) Introduces a small performance overhead into the runtime library in all cases... Maybe we could live with this though.
>>> 3) Introduces a dramatic performance overhead into TLS for the runtime library, a likely deal-breaker for tsan.
>> 
>> Suppose only LLVMSupport is pulled out into a separate DSO. Would the overhead be the same as making all of compiler-rt a separate dynamic library? If we only use it to symbolicate the trace after an error is hit, the performance overhead could be much smaller.
>> 
>> This might work for asan, but my impression is it would not work for tsan. There, we need to filter errors based on the symbolized backtrace, so it won't just be in the error-path.
> 
> I meant: you hit a "possible" error, symbolicate it, and filter it out if it's blacklisted. It should be possible (at least in theory) not to require symbolication on every memory access, and hopefully the number of times you hit a possible error is much smaller. That said, I did not look at how this is actually implemented in TSan.
> 
> Yes, TSan needs symbolication only when it finds an actual data race. Still, the overhead it introduces is significant, and AFAIR TSan using the addr2line symbolicator runs slower than
> TSan w/o any symbolication on large binaries (alas, I can't provide more specific data).

My understanding is that you will have to pay the price for symbolicating the trace in-process regardless, so the performance difference would be between calling a statically linked library vs calling a dynamically linked library.
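
As a rough illustration of the flow discussed above (symbolicate only when a possible error is hit, then filter against a suppression list), here is a minimal C++ sketch. The Detector type, on_access, and the Symbolizer callback are hypothetical names made up for this example; they are not the actual ASan/TSan interfaces. The symbolizer is passed in as a callback, so the same slow path would run whether it is backed by a statically linked library or a DSO.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback type: maps a program counter to a function name.
// It could be backed by a statically linked library or by a DSO.
using Symbolizer = std::function<std::string(std::uintptr_t)>;

// Hypothetical detector, for illustration only.
struct Detector {
  Symbolizer symbolize;                   // installed by the runtime
  std::vector<std::string> suppressions;  // function names to ignore
  int symbolize_calls = 0;                // how often symbolication was paid for

  // Called on every instrumented access.
  // Fast path: no possible error, so no symbolication at all.
  // Slow path: symbolize the pc, then drop the report if it is suppressed.
  // Returns true if a report should be emitted.
  bool on_access(std::uintptr_t pc, bool possible_error) {
    if (!possible_error) return false;
    assert(symbolize && "a symbolizer must be installed");
    ++symbolize_calls;
    const std::string fn = symbolize(pc);
    for (const std::string& s : suppressions) {
      if (fn == s) return false;  // suppressed: filtered after symbolication
    }
    return true;
  }
};
```

Only the slow path calls the symbolizer, so symbolize_calls grows with the number of possible errors, not with the number of memory accesses.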

>  
>> Also, this is a short term solution rather than a long term solution. We will want to share more code further down the road, so I would like to continue to push for a better long-term solution.
>> 
>>  
>> 
>>> 
>>> To elaborate on #1 a bit, I'm not very enthusiastic about a strategy that *precludes* a statically linked binary from using the full runtime library. It will also limit the reach of asan on platforms which don't have a good DSO story or where it would be infeasible to get the DSO in place.... We have very real-world use cases that fall into this category.
>>> 
>>> 
>>> The drawbacks of the DSO approach seem fundamental to the technology used. The drawbacks to the static library seem like solvable technical challenges we need to write some code to deal with.
>> 
>> There is an advantage to using a dynamic library for compiler-rt on Darwin. To ensure there is only one copy of asan_init (and the data structures used by it) around, we link the compiler-rt library only into the executable. The current solution does not allow instrumenting a dynamic library (without instrumenting an executable as well). This is a major limitation, and complicating the distribution (#1) seems like a reasonable price to pay, at least for ASan, where the performance overhead should not be substantial.
>> 
>> Sorry, I should have been more clear.
>> 
>> I very much like preserving the ability to do *both* DSO and static library. It's precluding either strategy that seems bad to me because, as you point out, they both have their time, place, and purpose. As it happens, if we can solve the problem for a statically linked runtime library, I believe building a single DSO for the entire runtime will be straightforward.
> 
> -- 
> Alexey Samsonov, MSK
> 
