[llvm-bugs] [Bug 40570] New: lld ignores addend for weak undefined TLS symbol in static executable
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Feb 1 21:22:01 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40570
Bug ID: 40570
Summary: lld ignores addend for weak undefined TLS symbol in
static executable
Product: lld
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: ELF
Assignee: unassignedbugs at nondot.org
Reporter: rprichard at google.com
CC: llvm-bugs at lists.llvm.org, peter.smith at linaro.org
lld appears to discard an addend when it computes the address of a weak
undefined TLS symbol in a static executable. This shows up in a couple of ways.
(FWIW: This issue didn't come up in practice. A weak-undefined test case I
wrote for Bionic stumbled onto it.)
With an LE-model access:
__attribute__((weak,tls_model("local-exec")))
extern __thread char tls_var[64];
char* get_addr() { return &tls_var[32]; }
// Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld
With lld, the generated instructions use a TPOFF of zero rather than 32. That
is, &tls_var[0] and &tls_var[32] evaluate to the same address.
0000000000201000 <get_addr>:
201000: 64 48 8b 04 25 00 00 mov %fs:0x0,%rax
201007: 00 00
201009: 48 8d 80 00 00 00 00 lea 0x0(%rax),%rax
201010: c3 retq
Dropping the addend affects x86-64 IE accesses in a different way. On that target, a TLS IE
relocation always(?) has a -4 addend, which should apply to the PC-to-GOT
offset but not to the TP-to-symbol offset. When lld relaxes the IE relocation
to LE, it creates a R_RELAX_TLS_IE_TO_LE relocation internally. The -4 addend
is applied for non-weak-undef symbols, then it's canceled back out in
X86_64<ELFT>::relaxTlsIeToLe:
// The original code used a PC relative relocation.
// Need to compensate for the -4 it had in the addend.
write32le(Loc, Val + 4);
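
The cancellation for a defined symbol can be checked with a little arithmetic.
The sketch below is a simplified model, not lld's code: targetVA and
relaxedImmediate are hypothetical names, and the symbol offset and TP offset
are made-up numbers chosen only for illustration.

```cpp
#include <cstdint>

// Simplified model (assumption): for a defined TLS symbol, the value handed
// to the relaxation routine is symbol offset + addend + TP offset.
constexpr int64_t targetVA(int64_t symOffset, int64_t addend,
                           int64_t tpOffset) {
  return symOffset + addend + tpOffset;
}

// Mirrors `write32le(Loc, Val + 4)`: undo the -4 that the PC-relative IE
// relocation carried in its addend.
constexpr int64_t relaxedImmediate(int64_t val) { return val + 4; }

// Hypothetical layout: a symbol at offset 16 in a 64-byte TLS segment, so
// the TP offset of the segment start is -64 and the symbol is at TP - 48.
static_assert(relaxedImmediate(targetVA(16, -4, -64)) == -48,
              "the -4 addend cancels for a defined symbol");
```

The -4 never reaches the final immediate as long as it was added in the first
place, which is the step the weak-undef path skips.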
For a weak-undef symbol, the -4 addend is ignored here (added in
https://reviews.llvm.org/D24832):
  case R_RELAX_TLS_GD_TO_LE:
  case R_RELAX_TLS_IE_TO_LE:
  case R_RELAX_TLS_LD_TO_LE:
  case R_TLS:
    // A weak undefined TLS symbol resolves to the base of the TLS
    // block, i.e. gets a value of zero. If we pass --gc-sections to
    // lld and .tbss is not referenced, it gets reclaimed and we don't
    // create a TLS program header. Therefore, we resolve this
    // statically to zero.
    if (Sym.isTls() && Sym.isUndefWeak())
      return 0;
    return Sym.getVA(A) + getTlsTpOffset();
The result is that lld resolves a weak-undef symbol to 4 on x86-64:
__attribute__((weak,tls_model("initial-exec")))
extern __thread char tls_var;
char* get_addr() { return &tls_var; }
// Run: clang -O2 test.c -static -nostdlib -fuse-ld=lld
lld output:
0000000000201000 <get_addr>:
201000: 64 48 8b 04 25 00 00 mov %fs:0x0,%rax
201007: 00 00
201009: 48 8d 80 04 00 00 00 lea 0x4(%rax),%rax
201010: c3 retq
Including the addend in getRelocTargetVA corrects the two behaviors described
above. If lld's behavior should change, maybe this is the fix:
diff --git a/ELF/InputSection.cpp b/ELF/InputSection.cpp
index 78f4812a5..c986012cb 100644
--- a/ELF/InputSection.cpp
+++ b/ELF/InputSection.cpp
@@ -741,7 +741,7 @@ static uint64_t getRelocTargetVA(const InputFile *File, RelType Type, int64_t A,
     // create a TLS program header. Therefore, we resolve this
     // statically to zero.
     if (Sym.isTls() && Sym.isUndefWeak())
-      return 0;
+      return A;
     return Sym.getVA(A) + getTlsTpOffset();
   case R_RELAX_TLS_GD_TO_LE_NEG:
   case R_NEG_TLS:
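
As a sanity check on the one-line change, here is a minimal model of the
weak-undef branch before and after the patch. It is a sketch, not lld's code:
weakUndefBefore, weakUndefAfter, leImmediate, and ieToLeImmediate are
hypothetical names. The only behavior taken from lld is `return 0` versus
`return A` and the `Val + 4` adjustment in relaxTlsIeToLe; the assumption that
an LE access writes the value unchanged is part of the model.

```cpp
#include <cstdint>

// Value produced for a weak undefined TLS symbol (model of the branch above).
constexpr int64_t weakUndefBefore(int64_t /*A*/) { return 0; } // current
constexpr int64_t weakUndefAfter(int64_t A) { return A; }      // proposed

// LE access (assumption): the value is written into the instruction as is.
constexpr int64_t leImmediate(int64_t val) { return val; }

// IE relaxed to LE on x86-64: relaxTlsIeToLe writes Val + 4.
constexpr int64_t ieToLeImmediate(int64_t val) { return val + 4; }

// &tls_var[32] with the local-exec model carries an addend of 32.
static_assert(leImmediate(weakUndefBefore(32)) == 0, "addend is lost");
static_assert(leImmediate(weakUndefAfter(32)) == 32, "addend is kept");

// &tls_var with the initial-exec model, relaxed to LE, carries an addend of -4.
static_assert(ieToLeImmediate(weakUndefBefore(-4)) == 4, "stray +4 remains");
static_assert(ieToLeImmediate(weakUndefAfter(-4)) == 0, "resolves to zero");
```

The "before" values match the 0x0 and 0x4 displacements in the two
disassembly listings earlier in this report.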
I'm not sure what guarantees exist for a weak-undefined TLS symbol. While I
think lld's behavior makes sense, AFAICT it is different from bfd/gold. I think
lld's behavior matches what a dynamic linker will do with a dynamic TPREL
relocation.
I studied bfd and gold output for a while (binutils 2.30), on x86-{32,64} and
arm{32,64}, and I tried to summarize the variety of behavior I saw when linking
a static executable. I'm trying to determine what value gets added to the
thread pointer for a reference to a weak-undefined symbol:
 - bfd:
    - Generally, bfd uses something like (-p_vaddr + getTlsTpOffset()).
      If we instead had the VA of a defined symbol, the LE calculation would
      be (VA - p_vaddr + getTlsTpOffset()), so I think bfd is just treating
      the weak-undef VA as 0. p_vaddr is the address of the TLS initialization
      image, stored in the PT_TLS segment. The resulting &tls_var is not very
      meaningful -- it could point to unmapped memory.
    - [arm64] segfaults if the program has no TLS segment
    - [otherwise] Uses 0 if the program has no TLS segment (matching lld)
 - gold (special case: IE-to-LE on x86-64 only)
    - Uses 0 like lld. This special case doesn't apply to arm{32,64} or x86-32.
      I'm not sure what's up here -- this special case doesn't apply if I just
      use an LE access.
 - gold (otherwise):
    - internal error if the program has no TLS segment
       - typically in relocate_tls
       - for arm32 IE, it's in do_write instead
    - Otherwise uses getTlsTpOffset(), the start of the executable's TLS
      segment.
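
The formulas in the list above can be put side by side in a small calculation.
This is only a sketch of the summary, not code from any of the linkers:
TlsLayout, definedSym, bfdWeakUndef, goldWeakUndef, and lldWeakUndef are
hypothetical names, the example p_vaddr and TP offset are made-up numbers, and
the crash and no-TLS-segment cases are not modeled.

```cpp
#include <cstdint>

// Hypothetical inputs describing the executable's PT_TLS segment.
struct TlsLayout {
  int64_t pVaddr;   // p_vaddr: address of the TLS initialization image
  int64_t tpOffset; // stands in for getTlsTpOffset()
};

// Defined symbol, LE access: VA - p_vaddr + getTlsTpOffset().
constexpr int64_t definedSym(TlsLayout l, int64_t va) {
  return va - l.pVaddr + l.tpOffset;
}

// bfd (generally): the same formula with the weak-undef VA treated as 0.
constexpr int64_t bfdWeakUndef(TlsLayout l) { return definedSym(l, 0); }

// gold (otherwise): the start of the executable's TLS segment.
constexpr int64_t goldWeakUndef(TlsLayout l) { return l.tpOffset; }

// lld, and gold's x86-64 IE-to-LE special case: 0.
constexpr int64_t lldWeakUndef(TlsLayout) { return 0; }

// Made-up example: PT_TLS at 0x201000, segment start at TP - 64.
constexpr TlsLayout kExample{0x201000, -64};
static_assert(definedSym(kExample, 0x201010) == -48, "defined symbol");
static_assert(bfdWeakUndef(kExample) == -0x201000 - 64, "far below TP");
static_assert(goldWeakUndef(kExample) == -64, "start of TLS segment");
static_assert(lldWeakUndef(kExample) == 0, "thread pointer itself");
```

With these example numbers the bfd value lands about 2 MiB below the thread
pointer, which illustrates why the resulting &tls_var may point to unmapped
memory.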