[llvm-dev] Use of the C++ standard library in XRay compiler-rt
Evgenii Stepanov via llvm-dev
llvm-dev at lists.llvm.org
Wed Mar 8 15:12:06 PST 2017
On Wed, Mar 8, 2017 at 2:32 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
> On Wed, Mar 8, 2017 at 2:28 PM Tim Shen <timshen at google.com> wrote:
>>
>> On Wed, Mar 8, 2017 at 1:49 PM David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>> So I stumbled across an issue that I think is a bit fundamental:
>>>
>>> The xray runtime uses the C++ standard library.
>>>
>>> This seems like a problem because whatever C++ standard library is used
>>> to compile the XRay runtime may not be the same as the C++ standard library
>>> (if any) that is used to build the target application and link XRay into.
>>>
>>> Does this make sense? Is this a problem?
>>>
>>> Talking to Chandler over lunch, it sounds like there are a couple of
>>> options: either remove the dependency (much like, I believe, the sanitizer
>>> runtimes, which use nothing from the C++ standard library and replace
>>> everything with custom data structures, etc.) or, perhaps more drastically,
>>> change the way the runtimes are built such that they statically link a
>>> private version of, say, libc++.
>>
>>
>> What was the reason for not statically linking a C++ standard library into
>> the sanitizer runtimes back when they were created?
>
>
> Not sure - Evgeniy (cc'd) might know. Partly perhaps the development cost of
> having to isolate that statically linked library from colliding with any
> other (some kind of mangling scheme would have to be used, I think? to avoid
> such a collision).
This. But we also want to avoid libc++ calling libc, because we may be
inside a libc interceptor. Sanitizer_common stuff mainly uses
internal_* implementations and raw system calls.
Building such an isolated library is hard, especially if it has to be
a static library - then you need to use either relocatable link (which
is buggy) or LTO (which was in a bad shape back then). We do something
like this for the symbolizer (see
lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh), but not
by default, and it is not integrated in the build system properly.
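To make the "internal_* implementations" point concrete, here is a minimal
sketch of what such helpers can look like when no libc or C++ standard library
header is included. The namespace and the uptr alias below are assumptions
made for illustration; the real routines in sanitizer_common are more careful
than this.

```cpp
// Sketch only: hypothetical helpers in the spirit of the internal_* routines.
// No libc or C++ standard library headers are included on purpose.
namespace sketch {

// size_t without including <cstddef>.
using uptr = decltype(sizeof(0));

// Counts the bytes before the terminating NUL without calling libc strlen.
inline uptr internal_strlen(const char *s) {
  uptr n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Copies n bytes one at a time without calling libc memcpy.
// The regions must not overlap.
inline void *internal_memcpy(void *dst, const void *src, uptr n) {
  char *d = static_cast<char *>(dst);
  const char *s = static_cast<const char *>(src);
  for (uptr i = 0; i < n; ++i) d[i] = s[i];
  return dst;
}

}  // namespace sketch
```

One caveat with this approach: an optimizing compiler may recognize such loops
and turn them back into calls to memcpy or strlen, so a runtime built this way
likely also needs build flags that prevent that.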
>
>>
>>
>>>
>>>
>>> Chandler seemed to think maybe we could do this state-side (Tim? Might be
>>> something you could handle) rather than pushing it back on to Dean, if that
>>> sounds reasonable?
>>
>>
>> I believe that "state-side" is LLVM team side?
>
>
> Right, yes, sorry.
>
>>
>> I agree that we should clean up the standard library usage even just for
>> consistency.
>>
>> Searching the xray directory for dependencies:
>> ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type
>> f|grep -v 'tests'` | sort | uniq -c
>> 1 #include <algorithm>
>> 10 #include <atomic>
>> 1 #include <bitset>
>> 6 #include <cassert>
>> 1 #include <cerrno>
>> 1 #include <cstddef>
>> 7 #include <cstdint>
>> 2 #include <cstdio>
>> 1 #include <cstdlib>
>> 2 #include <cstring>
>> 1 #include <deque>
>> 2 #include <iterator>
>> 2 #include <limits>
>> 2 #include <memory>
>> 4 #include <mutex>
>> 1 #include <system_error>
>> 1 #include <thread>
>> 2 #include <tuple>
>> 1 #include <unordered_map>
>> 1 #include <unordered_set>
>> 3 #include <utility>
>> I think the biggest part is containers, and they are mostly in
>> ./xray_buffer_queue.h and ./xray_fdr_logging.cc.
>>
>> dependencies without buffer queue and fdr logging:
>> ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type
>> f|egrep -v 'tests|buffer|fdr'` | sort | uniq -c
>> 9 #include <atomic>
>> 4 #include <cassert>
>> 1 #include <cerrno>
>> 1 #include <cstddef>
>> 6 #include <cstdint>
>> 2 #include <cstdio>
>> 1 #include <cstring>
>> 2 #include <iterator>
>> 2 #include <limits>
>> 1 #include <memory>
>> 3 #include <mutex>
>> 1 #include <thread>
>> 2 #include <tuple>
>> 2 #include <utility>
>> I believe that this is relatively easy to clean up. I can do that.
>>
>> I don't know how hard it is to rewrite buffer queue and fdr logging using
>> compiler_rt infrastructure.
>
>
> I think buffer_queue's probably sufficiently well bounded that it shouldn't
> be drastically hard to replace it with a custom implementation. Haven't
> looked at fdr_logging.
>
> Maps/dictionary-like things might be a bit of a pain in particular. Not sure
> if the sanitizers already have some reusable idioms/libraries for that.
>
> I'm also not really clear on where the boundary is - which headers or
> language features ('new'?) can be used, and which can't. Can't say I've ever
> tried to make code library agnostic.
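As an illustration of what a library-agnostic replacement for a
dictionary-like container could look like, here is a rough sketch of a
fixed-capacity hash map that includes no standard headers and never allocates.
Everything in it (the FixedMap name, the u64 alias, the hash mixing step) is
hypothetical and not taken from compiler-rt; it only supports insert and
lookup, which keeps linear probing simple.

```cpp
// Sketch only: a fixed-capacity open-addressing map with no standard headers.
// All names here are hypothetical, not part of compiler-rt.
namespace sketch {

using u64 = unsigned long long;

template <unsigned Capacity>
class FixedMap {
  static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                "Capacity must be a power of two");

  struct Slot {
    u64 key;
    u64 value;
    bool used;
  };

  Slot slots_[Capacity] = {};  // zero-initialized: every slot starts unused
  unsigned size_ = 0;

  // Mixes the key bits and masks the result down to a slot index.
  static unsigned hash(u64 k) {
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    return static_cast<unsigned>(k) & (Capacity - 1);
  }

 public:
  // Inserts or overwrites; returns false when the table is full.
  bool insert(u64 key, u64 value) {
    unsigned i = hash(key);
    for (unsigned probes = 0; probes < Capacity; ++probes) {
      Slot &s = slots_[i];
      if (!s.used) {
        s = {key, value, true};
        ++size_;
        return true;
      }
      if (s.key == key) {
        s.value = value;
        return true;
      }
      i = (i + 1) & (Capacity - 1);  // linear probing
    }
    return false;
  }

  // Returns a pointer to the stored value, or nullptr if the key is absent.
  const u64 *find(u64 key) const {
    unsigned i = hash(key);
    for (unsigned probes = 0; probes < Capacity; ++probes) {
      const Slot &s = slots_[i];
      if (!s.used) return nullptr;  // no deletions, so an empty slot ends the chain
      if (s.key == key) return &s.value;
      i = (i + 1) & (Capacity - 1);
    }
    return nullptr;
  }

  unsigned size() const { return size_; }
};

}  // namespace sketch
```

A fixed capacity sidesteps the question of whether 'new' can be used at all,
at the cost of having to pick an upper bound up front.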
>
>>
>>
>>>
>>>
>>> (this came up for me due to what's probably a bug in the way compiler-rt
>>> is built - where the lib itself is built with the host compiler but the
>>> tests are built/linked with the just-built clang. My host compiler uses
>>> libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it
>>> fails to link due to this mismatch)