Incorrect local-dynamic TLS linker optimization with clang-generated code on PowerPC

Chandler Carruth chandlerc at
Wed Jan 28 11:04:02 PST 2015

On Wed, Jan 28, 2015 at 10:06 AM, Bill Schmidt <wschmidt at>

> On Wed, 2015-01-28 at 11:30 -0600, Bill Schmidt wrote:
> *snip*
> >
> > The issue is that we are generating two calls to __get_tls_addr in
> > different basic blocks.  CSE at the MI level recognizes that the address
> > computation can be commoned.  So r29 is copied to r3 before both of the
> > __get_tls_addr calls.
> >
> > I think it would probably be somewhat difficult to avoid this commoning
> > (would have to be specific to these address ops and only when accessing
> > TLS vars for local/global-dynamic).  And in general it's a good thing to
> > do.  Alan, does this complicate matters beyond what the linker can
> > handle?
> Part of the issue here is the late modeling of the call to
> __get_tls_addr as a true call.  The MachineCSE pass does not have any
> logic to common idempotent calls (presumably that is handled at the IR
> level), so the addis/addi get commoned but the __get_tls_call does not.
> Originally we modeled __get_tls_addr as a special target-specific opcode
> that was expanded late.  We had problems with keeping the register
> copies lined up correctly in the presence of multiple TLS variable
> accesses, and we fixed this by using the regular call machinery instead.
> In retrospect, it appears that although this is a much cleaner
> implementation, it limits the ability to CSE the __get_tls_addr calls.
> We can work on a fix to go back to the old model and fix the register
> copies with glue instead, but this would be a pretty substantial change
> to backport to 3.6.  However, if the linker can't handle the code as is,
> then this would seem to be the only way forward.
> Thoughts?

I think that this is the right direction, definitely. I don't feel strongly
about backporting it to 3.6 though, we can just put a workaround such as
the one suggested by Ulrich there (and I'm working on that now). The key is
to make sure 3.7 handles this well.

While I think we should fix the linker, I fear the buggy linkers are
already really widespread and so we should probably ensure in the PPC
backend that r3 is always used. My impression is that much of what you are
talking about in plumbing this through would also allow us to add a
synthetic constraint of that form.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the llvm-commits mailing list