<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - lld ignores addend for weak undefined TLS symbol in static executable"

   href="https://bugs.llvm.org/show_bug.cgi?id=40570">40570</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>lld ignores addend for weak undefined TLS symbol in static executable

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>lld

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>unspecified

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>ELF

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>rprichard@google.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org, peter.smith@linaro.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>lld appears to discard an addend when it computes the address of a weak

undefined TLS symbol in a static executable. This shows up in a couple of ways.

(FWIW: This issue didn't come up in practice. A weak-undefined test case I

wrote for Bionic stumbled onto it.)

With an LE-model access:

    __attribute__((weak,tls_model("local-exec")))

    extern __thread char tls_var[64];

    char* get_addr() { return &tls_var[32]; }

    // Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld

With lld, the generated instructions use a TPOFF of zero rather than 32, i.e.

&tls_var[0] and &tls_var[32] would evaluate to the same address.

    0000000000201000 <get_addr>:

      201000: 64 48 8b 04 25 00 00  mov    %fs:0x0,%rax

      201007: 00 00 

      201009: 48 8d 80 00 00 00 00  lea    0x0(%rax),%rax

      201010: c3                    retq   

Dropping the addend affects IE-model accesses on x86-64 differently. A TLS IE

relocation always(?) has a -4 addend, which should apply to the PC-to-GOT

offset but not to the TP-to-symbol offset. When lld relaxes the IE relocation

to LE, it creates a R_RELAX_TLS_IE_TO_LE relocation internally. The -4 addend

is applied for non-weak-undef symbols, then it's canceled back out in

X86_64<ELFT>::relaxTlsIeToLe:

    // The original code used a PC relative relocation.

    // Need to compensate for the -4 it had in the addend.

    write32le(Loc, Val + 4);

For a weak-undef symbol, the -4 addend is ignored here (added in

<a href="https://reviews.llvm.org/D24832">https://reviews.llvm.org/D24832</a>):

    case R_RELAX_TLS_GD_TO_LE:

    case R_RELAX_TLS_IE_TO_LE:

    case R_RELAX_TLS_LD_TO_LE:

    case R_TLS:

      // A weak undefined TLS symbol resolves to the base of the TLS

      // block, i.e. gets a value of zero. If we pass --gc-sections to

      // lld and .tbss is not referenced, it gets reclaimed and we don't

      // create a TLS program header. Therefore, we resolve this

      // statically to zero.

      if (Sym.isTls() && Sym.isUndefWeak())

        return 0;

      return Sym.getVA(A) + getTlsTpOffset();

The result is that lld resolves a weak-undef symbol to a TP offset of 4 on

x86-64 (the 0 returned above, plus the 4 added in relaxTlsIeToLe):

    __attribute__((weak,tls_model("initial-exec")))

    extern __thread char tls_var;

    char* get_addr() { return &tls_var; }

    // Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld

lld output:

    0000000000201000 <get_addr>:

      201000: 64 48 8b 04 25 00 00  mov    %fs:0x0,%rax

      201007: 00 00 

      201009: 48 8d 80 04 00 00 00  lea    0x4(%rax),%rax

      201010: c3                    retq   

Including the addend in getRelocTargetVA corrects the two behaviors described

above. If lld's behavior should change, maybe this is the fix:

diff --git a/ELF/InputSection.cpp b/ELF/InputSection.cpp

index 78f4812a5..c986012cb 100644

--- a/ELF/InputSection.cpp

+++ b/ELF/InputSection.cpp

@@ -741,7 +741,7 @@ static uint64_t getRelocTargetVA(const InputFile *File, RelType Type, int64_t A,

     // create a TLS program header. Therefore, we resolve this

     // statically to zero.

     if (Sym.isTls() && Sym.isUndefWeak())

-      return 0;

+      return A;

     return Sym.getVA(A) + getTlsTpOffset();

   case R_RELAX_TLS_GD_TO_LE_NEG:

   case R_NEG_TLS:

I'm not sure what guarantees exist for a weak-undefined TLS symbol. While I

think lld's behavior makes sense, AFAICT it is different from bfd/gold. I think

lld's behavior matches what a dynamic linker will do with a dynamic TPREL

relocation.

I studied bfd and gold output for a while (binutils 2.30), on x86-{32,64} and

arm{32,64}, and I tried to summarize the variety of behavior I saw when linking

a static executable. I'm trying to determine what value gets added to the

thread pointer for a reference to a weak-undefined symbol:

 - bfd:

    - Generally, bfd uses something like (-p_vaddr + getTlsTpOffset()).

      If we instead had the VA of a defined symbol, the LE calculation would

      be (VA - p_vaddr + getTlsTpOffset()), so I think bfd is just treating

      the weak-undef VA as 0. p_vaddr is the address of the TLS initialization

      image, stored in the PT_TLS segment. The resulting &tls_var is not very

      meaningful -- it could point to unmapped memory.

    - [arm64] segfaults if the program has no TLS segment

    - [otherwise] Uses 0 if the program has no TLS segment (matching lld)

 - gold (special case: IE-to-LE on x86-64 only)

    - Uses 0 like lld. This special case doesn't apply to arm{32,64} or x86-32.

      I'm not sure what's up here -- this special case doesn't apply if I just

      use an LE access.

 - gold (otherwise):

    - internal error if the program has no TLS segment

       - typically in relocate_tls

       - for arm32 IE, it's in do_write instead

    - Otherwise uses getTlsTpOffset(), the start of the executable's TLS

      segment.</pre>

        </div>

      </p>


    </body>

</html>