[llvm-commits] [PATCH, RFC] Support for global-dynamic TLS mode in 64-bit PowerPC target -- preliminary

Bill Schmidt wschmidt at linux.vnet.ibm.com
Mon Dec 10 17:13:06 PST 2012


This WIP patch implements the global-dynamic TLS model for 64-bit PowerPC.
However, it is not ideal, and I need some help figuring out how I can improve
it.
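
For concreteness, here is a minimal C++ sketch of the kind of source that
leads to a global-dynamic access.  The tls_model attribute is the GCC/Clang
spelling for requesting a TLS model explicitly; the function name
address_of_x is hypothetical and is not part of the patch.  Note that a
linker may still relax the access to a cheaper model when it can.

```cpp
#include <cassert>

// A thread-local variable explicitly marked as global-dynamic.
// (Assumption: a GCC- or Clang-compatible compiler that accepts tls_model.)
__attribute__((tls_model("global-dynamic"))) thread_local int x = 0;

// Taking the address of x is what requires the TLS access sequence;
// in the global-dynamic model this goes through __tls_get_addr.
int *address_of_x() { return &x; }
```

Each thread that calls address_of_x() gets the address of its own copy
of x.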

Given a thread-local symbol x with global-dynamic access, the code sequence
to be generated to obtain x's address is:

  Instruction                          Relocation              Symbol
  addis ra,r2,x@got@tlsgd@ha           R_PPC64_GOT_TLSGD16_HA  x
  addi  r3,ra,x@got@tlsgd@l            R_PPC64_GOT_TLSGD16_L   x
  bl    __tls_get_addr                 R_PPC64_TLSGD           x
                                       R_PPC64_REL24           __tls_get_addr
  nop
  <use address in r3>
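
As a reminder of why the address computation takes an addis/addi pair, the
following C++ sketch models how a 32-bit offset is split into a
high-adjusted half (@ha) and a low half (@l).  The helper names (ha16,
lo16, addis, addi, materialize) are hypothetical and only model the
arithmetic; they are not LLVM APIs, and the sketch assumes the offset fits
in a signed 32-bit range and that right-shifting a negative value is
arithmetic, as it is on common compilers.

```cpp
#include <cassert>
#include <cstdint>

// High-adjusted 16 bits: adding 0x8000 before shifting compensates for
// the sign extension that addi applies to the low half.
inline int16_t ha16(int64_t v) {
  return static_cast<int16_t>(((v + 0x8000) >> 16) & 0xffff);
}

// Low 16 bits, interpreted as a signed immediate.
inline int16_t lo16(int64_t v) {
  return static_cast<int16_t>(v & 0xffff);
}

// Model of addis: add the immediate shifted left by 16 bits.
inline int64_t addis(int64_t base, int16_t imm) {
  return base + static_cast<int64_t>(imm) * 65536;
}

// Model of addi: add the sign-extended immediate.
inline int64_t addi(int64_t base, int16_t imm) {
  return base + imm;
}

// Model of the two-instruction sequence: addis ra,r2,off@ha ; addi r3,ra,off@l
inline int64_t materialize(int64_t toc, int64_t offset) {
  int64_t ra = addis(toc, ha16(offset));
  return addi(ra, lo16(offset));
}
```

For any offset in range, materialize(toc, offset) yields toc + offset, which
is why the linker can fill in the two relocations independently.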

The way I've approached this is to have LowerGlobalTLSAddress convert
the TargetGlobalAddress node into a DAG of three nodes:

  GET_TLS_ADDR(
    ADDI_TLSGD_L(
      ADDIS_TLSGD_HA(X2, x),
      x),
    x)

The problem is that straightforward assembly of this DAG structure gives
the following inferior assembly code:

  Instruction                          Relocation              Symbol
  addis ra,r2,x@got@tlsgd@ha           R_PPC64_GOT_TLSGD16_HA  x
  addi  rb,ra,x@got@tlsgd@l            R_PPC64_GOT_TLSGD16_L   x
  addi  r3,rb,0
  bl    __tls_get_addr                 R_PPC64_TLSGD           x
                                       R_PPC64_REL24           __tls_get_addr
  nop
  addi  rc,r3,0
  <use address in rc>

This is because the call to __tls_get_addr requires its argument and its
return value to use register X3, so copies are generated to move between
the logical registers and the physical register X3.

There are two approaches that I thought of to rectify this, but so far I
don't see how to make either of them work.  Both would be done in
LowerGlobalTLSAddress instead of generating the GET_TLS_ADDR node.
Therefore the copies would be generated early enough that the register
allocator could coalesce them away.

 (1) Use LowerCallTo() to create a call sequence with one argument:
     the result of ADDI_TLSGD_L(...), analogously to what's done for
     LowerINIT_TRAMPOLINE.  This seems like the obvious thing to do,
     until you realize that you need a token chain SDNode to generate
     a call, and LowerGlobalTLSAddress doesn't provide one.

 (2) Use getCopyToReg and getCopyFromReg to generate the copies
     directly around GET_TLS_ADDR, which then just expands into the
     "bl" and the "nop".  Unfortunately, these routines also require
     a token chain node.

So, part of my problem is knowing the rules about token chains.  I'm 
pretty new to LLVM and I'm not sure exactly what purposes may be served
by them, other than tying nodes together that otherwise would not be.
Do I really need a token chain node here, or is it OK to have a NULL chain
node and use one of the above solutions?  (Seems unlikely, but it would
be convenient if so...)

Assuming I'm not so fortunate, what would be the best way to approach
this problem?


There's another aspect of this patch that doesn't make me happy:  the
hackery I added in PPCMCCodeEmitter.cpp:getDirectBrEncoding().  Each
of these get...Encoding() routines is supposed to be called on behalf
of one operand at a time.  When using integrated assembly, I couldn't
get the second operand of the BL8_NOP_ELF_TLSGD to produce a relocation
the "correct" way, so for now I put in a bloody hack to handle both
operands at once.  Perhaps somebody can see what I'm doing wrong.

The routine I want to call is PPCMCCodeEmitter.cpp:getTLSGDEncoding().
In PPCInstr64Bit.td, I defined the "tlsgd" operand class to use this
method, and specified it as the second input operand for BL8_NOP_ELF_TLSGD.
However, the encoding code generated by TableGen treated this exactly as
BL8_NOP_ELF, which has no second operand, and thus my encoder was never
called.  I didn't see anything in IForm_and_DForm_4_zero that would
explain this, and I'm currently at a loss.

Thanks for any help with my issues!

Bill

-- 
Bill Schmidt, Ph.D.
IBM Advance Toolchain for PowerLinux
IBM Linux Technology Center
wschmidt at us.ibm.com
wschmidt at linux.vnet.ibm.com





-------------- next part --------------
A non-text attachment was scrubbed...
Name: tls-gd-2012-12-10b.patch
Type: text/x-patch
Size: 18470 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121210/25e4dec6/attachment.bin>

