[llvm-bugs] [Bug 40570] New: lld ignores addend for weak undefined TLS symbol in static executable

Fri Feb 1 21:22:01 PST 2019

https://bugs.llvm.org/show_bug.cgi?id=40570

            Bug ID: 40570
           Summary: lld ignores addend for weak undefined TLS symbol in
                    static executable
           Product: lld
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: ELF
          Assignee: unassignedbugs at nondot.org
          Reporter: rprichard at google.com
                CC: llvm-bugs at lists.llvm.org, peter.smith at linaro.org

lld appears to discard an addend when it computes the address of a weak
undefined TLS symbol in a static executable. This shows up in a couple of ways.
(FWIW: This issue didn't come up in practice. A weak-undefined test case I
wrote for Bionic stumbled onto it.)

With an LE-model access:

    __attribute__((weak,tls_model("local-exec")))
    extern __thread char tls_var[64];

    char* get_addr() { return &tls_var[32]; }

    // Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld

With lld, the generated instructions use a TPOFF of zero rather than 32, i.e.
&tls_var[0] and &tls_var[32] evaluate to the same address.

    0000000000201000 <get_addr>:
      201000: 64 48 8b 04 25 00 00  mov    %fs:0x0,%rax
      201007: 00 00 
      201009: 48 8d 80 00 00 00 00  lea    0x0(%rax),%rax
      201010: c3                    retq   

Dropping the addend affects x86-64 in a different way. On that target, a TLS IE
relocation always(?) has a -4 addend, which should apply to the PC-to-GOT
offset but not to the TP-to-symbol offset. When lld relaxes the IE relocation
to LE, it creates a R_RELAX_TLS_IE_TO_LE relocation internally. The -4 addend
is applied for non-weak-undef symbols, then it's canceled back out in
X86_64<ELFT>::relaxTlsIeToLe:

    // The original code used a PC relative relocation.
    // Need to compensate for the -4 it had in the addend.
    write32le(Loc, Val + 4);

For a weak-undef symbol, however, the -4 addend is ignored in
getRelocTargetVA (this special case was added in
https://reviews.llvm.org/D24832):

    case R_RELAX_TLS_GD_TO_LE:
    case R_RELAX_TLS_IE_TO_LE:
    case R_RELAX_TLS_LD_TO_LE:
    case R_TLS:
      // A weak undefined TLS symbol resolves to the base of the TLS
      // block, i.e. gets a value of zero. If we pass --gc-sections to
      // lld and .tbss is not referenced, it gets reclaimed and we don't
      // create a TLS program header. Therefore, we resolve this
      // statically to zero.
      if (Sym.isTls() && Sym.isUndefWeak())
        return 0;
      return Sym.getVA(A) + getTlsTpOffset();

As a result, the +4 compensation is applied to 0 rather than to -4, and lld
resolves a weak-undef symbol to a TP offset of 4 on x86-64:

    __attribute__((weak,tls_model("initial-exec")))
    extern __thread char tls_var;

    char* get_addr() { return &tls_var; }

    // Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld

lld output:

    0000000000201000 <get_addr>:
      201000: 64 48 8b 04 25 00 00  mov    %fs:0x0,%rax
      201007: 00 00 
      201009: 48 8d 80 04 00 00 00  lea    0x4(%rax),%rax
      201010: c3                    retq   

Including the addend in getRelocTargetVA corrects the two behaviors described
above. If lld's behavior should change, maybe this is the fix:

diff --git a/ELF/InputSection.cpp b/ELF/InputSection.cpp
index 78f4812a5..c986012cb 100644
--- a/ELF/InputSection.cpp
+++ b/ELF/InputSection.cpp
@@ -741,7 +741,7 @@ static uint64_t getRelocTargetVA(const InputFile *File, RelType Type, int64_t A,
     // create a TLS program header. Therefore, we resolve this
     // statically to zero.
     if (Sym.isTls() && Sym.isUndefWeak())
-      return 0;
+      return A;
     return Sym.getVA(A) + getTlsTpOffset();
   case R_RELAX_TLS_GD_TO_LE_NEG:
   case R_NEG_TLS:
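
To make the arithmetic concrete, here is a minimal sketch of the two cases
above. It is a model, not lld code: the function names and the keepAddend flag
are hypothetical, and it only mirrors the weak-undef branch of
getRelocTargetVA plus the +4 compensation in relaxTlsIeToLe.

```cpp
#include <cstdint>

// Hypothetical model of the value lld ends up writing for a TLS reference
// to a weak-undefined symbol in a static link. Names are illustrative only.

// Mirrors the `if (Sym.isTls() && Sym.isUndefWeak())` branch of
// getRelocTargetVA: current lld returns 0, the proposed patch returns A.
constexpr int64_t weakUndefTargetVA(int64_t addend, bool keepAddend) {
  return keepAddend ? addend : 0;
}

// Direct LE access: the computed value is written unchanged.
constexpr int64_t leOffset(int64_t addend, bool keepAddend) {
  return weakUndefTargetVA(addend, keepAddend);
}

// IE relaxed to LE on x86-64: relaxTlsIeToLe writes Val + 4 to cancel the
// -4 addend carried by the original PC-relative relocation.
constexpr int64_t ieToLeOffset(int64_t addend, bool keepAddend) {
  return weakUndefTargetVA(addend, keepAddend) + 4;
}

// &tls_var[32] with an LE access: the addend is 32.
static_assert(leOffset(32, false) == 0, "current lld: addend dropped");
static_assert(leOffset(32, true) == 32, "with the patch: addend kept");

// &tls_var with an IE access on x86-64: the addend is -4.
static_assert(ieToLeOffset(-4, false) == 4, "current lld: resolves to 4");
static_assert(ieToLeOffset(-4, true) == 0, "with the patch: resolves to 0");
```

The static_asserts reproduce the 0x0 and 0x4 displacements in the lld output
shown earlier, and the 32 and 0 that returning A would give instead.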

I'm not sure what guarantees exist for a weak-undefined TLS symbol. While I
think lld's behavior makes sense, AFAICT it is different from bfd/gold. I think
lld's behavior matches what a dynamic linker will do with a dynamic TPREL
relocation.

I studied bfd and gold output for a while (binutils 2.30), on x86-{32,64} and
arm{32,64}, and I tried to summarize the variety of behavior I saw when linking
a static executable. I'm trying to determine what value gets added to the
thread pointer for a reference to a weak-undefined symbol:

 - bfd:
    - Generally, bfd uses something like (-p_vaddr + getTlsTpOffset()).
      If we instead had the VA of a defined symbol, the LE calculation would
      be (VA - p_vaddr + getTlsTpOffset()), so I think bfd is just treating
      the weak-undef VA as 0. p_vaddr is the address of the TLS initialization
      image, stored in the PT_TLS segment. The resulting &tls_var is not very
      meaningful -- it could point to unmapped memory.
    - [arm64] segfaults if the program has no TLS segment
    - [otherwise] Uses 0 if the program has no TLS segment (matching lld)

 - gold (special case: IE-to-LE on x86-64 only):
    - Uses 0 like lld. This special case doesn't apply to arm{32,64} or x86-32.
      I'm not sure what's up here -- this special case doesn't apply if I just
      use an LE access.

 - gold (otherwise):
    - internal error if the program has no TLS segment
       - typically in relocate_tls
       - for arm32 IE, it's in do_write instead
    - Otherwise uses getTlsTpOffset(), the start of the executable's TLS
      segment.
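
The summary above can be restated as a small sketch of the value each linker
adds to the thread pointer when the program does have a TLS segment. The
function names and the sample numbers (a p_vaddr of 0x202000 and a TP offset
of -64) are made up for illustration; tlsTpOffset merely stands in for
whatever lld's getTlsTpOffset() would return on the target.

```cpp
#include <cstdint>

// Hypothetical inputs: pVaddr is the PT_TLS segment's p_vaddr, and
// tlsTpOffset stands in for the result of lld's getTlsTpOffset().

// LE offset for a defined symbol at virtual address va.
constexpr int64_t definedOffset(uint64_t va, uint64_t pVaddr,
                                int64_t tlsTpOffset) {
  return static_cast<int64_t>(va) - static_cast<int64_t>(pVaddr) + tlsTpOffset;
}

// bfd, as observed: the same formula with the weak-undef VA treated as 0.
constexpr int64_t bfdWeakUndefOffset(uint64_t pVaddr, int64_t tlsTpOffset) {
  return definedOffset(0, pVaddr, tlsTpOffset);
}

// gold (general case), as observed: just the TP offset.
constexpr int64_t goldWeakUndefOffset(int64_t tlsTpOffset) {
  return tlsTpOffset;
}

// lld, and gold's x86-64 IE-to-LE special case: zero.
constexpr int64_t lldWeakUndefOffset() { return 0; }

// Made-up sample values, for illustration only.
static_assert(definedOffset(0x202010, 0x202000, -64) == -48, "defined symbol");
static_assert(bfdWeakUndefOffset(0x202000, -64) == -0x202040, "bfd");
static_assert(goldWeakUndefOffset(-64) == -64, "gold");
static_assert(lldWeakUndefOffset() == 0, "lld");
```

With these sample numbers the bfd result lands about 2 MiB below the thread
pointer, which illustrates why the resulting &tls_var could point to unmapped
memory.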
