r180739 - Emit the TLS initialization functions into a list.

Richard Smith richard at metafoo.co.uk
Mon Apr 29 17:44:17 PDT 2013


On Mon, Apr 29, 2013 at 5:29 PM, John McCall <rjmccall at apple.com> wrote:

> On Apr 29, 2013, at 5:18 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>
> On Mon, Apr 29, 2013 at 5:13 PM, Bill Wendling <isanbard at gmail.com> wrote:
>
>> On Apr 29, 2013, at 5:09 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> > On Mon, Apr 29, 2013 at 3:27 PM, Bill Wendling <isanbard at gmail.com>
>> wrote:
>> > Author: void
>> > Date: Mon Apr 29 17:27:16 2013
>> > New Revision: 180739
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=180739&view=rev
>> > Log:
>> > Emit the TLS initialization functions into a list.
>> >
>> > Add the TLS initialization functions to a list of initialization
>> functions. The
>> > back-end takes this list and places the function pointers into the
>> correct
>> > section. This way they're called before `main()`.
>> >
>> > Why?
>>
>> "Why" what? Just as the description says, we place the function pointers
>> into the correct section and, at least on Darwin, the TLS variables are
>> initialized via calls to those function pointers. Just like global
>> c'tors.
>
>
> OK, but why are we switching to eagerly initializing them rather than
> doing it lazily? That's going to be extremely expensive for applications
> which start lots of threads (say, by using std::async) and have
> thread_local variables with non-trivial construction or destruction -- and
> we still emit the dynamic initialization on every odr-use, so it doesn't
> seem to save us anything.
>
>
> This should be a Darwin-specific change.
>
> The Darwin TLS model is that thread-local variables are lazily allocated
> and initialized, but only at the granularity of a single linkage unit.
> That is, as soon as one thread-local variable is touched, every other
> thread-local variable in that linkage unit is initialized at the same time.
> The linker implicitly synthesizes the access functions and, to do so, must
> receive a list of constructor functions to run, which is what this change
> collects.  I'm not sure I can fully defend this design, but it's what we've
> got right now.
>

Thanks, that makes sense. This should presumably be handled in the CGCXXABI
layer, and should definitely be done instead of generating and using the
Itanium _ZTH / _ZTW functions, not in addition to them. As it stands, both
mechanisms are emitted and you get no benefit from the new one. (This is
also not ABI-compatible with the Itanium proposal for thread_local, but I
assume you are fine with that.)
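
As a rough illustration of the lazy scheme the Itanium functions stand for,
here is a hand-written model. It is only a sketch: every name in it
(sketch::n_init, sketch::n_wrapper, sketch::demo, and so on) is hypothetical,
and it is not what Clang actually emits. n_init plays the role of the
per-variable init function (_ZTH) and n_wrapper the role of the wrapper
(_ZTW) that every odr-use goes through.

```cpp
#include <cassert>
#include <thread>

namespace sketch {

// Counts initializer runs. Not atomic: in this demo only one thread
// touches it at a time (the second thread is joined before it is read).
int init_runs = 0;

// Stands in for a non-trivial dynamic initializer.
int make_value() { ++init_runs; return 42; }

thread_local bool n_ready = false;  // per-thread guard
thread_local int n_storage = 0;     // per-thread storage for "n"

// Role of the init function: run the initializer at most once per thread.
void n_init() {
    if (!n_ready) {
        n_ready = true;
        n_storage = make_value();
    }
}

// Role of the wrapper: make sure "n" is initialized, then hand it out.
int &n_wrapper() {
    n_init();
    return n_storage;
}

// Uses "n" twice on the calling thread and once on a second thread,
// then reports how many times the initializer ran.
int demo() {
    int a = n_wrapper();
    int b = n_wrapper();
    assert(a == 42 && b == 42);
    std::thread t([] { n_wrapper(); });
    t.join();
    return init_runs;
}

}  // namespace sketch
```

In this model a thread that never calls n_wrapper never pays for the
initializer, which is the property that eager per-thread initialization
would give up.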

Also, you'll need to ensure that the first time each thread_local variable is
odr-used, it is "touched" enough to trigger initialization -- you can't
optimize an odr-use away or you'll change the meaning of the code. For
instance:

#include <stdio.h>
thread_local int n = puts("hello world");
void puts_once() { n; } // guarantees the 'puts' is issued
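
A slightly larger sketch of the same requirement, under stated assumptions:
the names greetings, greet, read_n and greetings_after_two_threads are made
up for illustration. Each thread that odr-uses n must see it initialized
first, so two threads that each read n must produce at least two greetings.
An implementation may also initialize n eagerly on other threads, so the
count is only a lower bound.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> greetings{0};  // how many times the initializer has run

// Dynamic initializer with an observable side effect.
int greet() {
    std::puts("hello world");
    return ++greetings;
}

thread_local int n = greet();

// Reading n is an odr-use, so n must be initialized on this thread first.
int read_n() { return n; }

// Runs two threads one after the other, each of which reads n once,
// and returns the number of initializer runs observed so far.
int greetings_after_two_threads() {
    std::thread a([] { read_n(); });
    a.join();
    std::thread b([] { read_n(); });
    b.join();
    return greetings.load();
}
```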