[llvm-bugs] [Bug 51703] New: References to extern thread_local variables clobber r11 on darwin

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Sep 1 07:25:05 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=51703

            Bug ID: 51703
           Summary: References to extern thread_local variables clobber
                    r11 on darwin
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: nicolasweber at gmx.de
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, pengfei.wang at intel.com,
                    spatel+llvm at rotateright.com

TLS wrapper functions use calling convention cxx_fast_tlscc, which per langref:

"""On X86-64 the callee preserves all general purpose registers, except for RDI
and RAX."""

When calling a non-dso_local TLS wrapper function on darwin, we'll end up
calling into dyld_stub_binder to resolve the wrapper function.

dyld_stub_binder clobbers r11:
https://github.com/opensource-apple/dyld/blob/master/src/dyld_stub_binder.s#L203

(The thunks inserted by lld and ld64 clobber r11 as well, probably because
they figure r11 is already overwritten by dyld_stub_binder.)

So we can't use cxx_fast_tlscc for non-dso_local TLS wrapper functions on
darwin.
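
For readers unfamiliar with the term, here is a rough C++ analogue of what a
"TLS wrapper function" does. This is only a sketch: the names tls_wrapper_j
and read_j are hypothetical, the real wrapper is the compiler-generated
__ZTW1j, and portable C++ cannot express cxx_fast_tlscc, so the stand-in
below uses the ordinary calling convention.

```cpp
#include <cassert>

// Stand-in for the variable that lives in another image
// (tlvhost.dylib in the repro below).
thread_local int j = 0;

// Hypothetical analogue of the compiler-generated thread wrapper (__ZTW1j).
// The real wrapper is emitted by the compiler and uses cxx_fast_tlscc.
int* tls_wrapper_j() {
  // A real wrapper would first run j's dynamic initializer, if it has one.
  return &j;
}

// What a use of `j` from another translation unit conceptually becomes:
// a call to the wrapper, then a load through the returned address.
int read_j() {
  return *tls_wrapper_j();
}
```

The call to the wrapper is the step that goes through the lazy-binding stub
when the wrapper is not dso_local, which is where r11 gets clobbered.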

--

That's unfortunate, of course, since cxx_fast_tlscc removes lots of stack
traffic. So maybe in time dyld_stub_binder could be changed to not clobber
r11 (somehow) and the linkers could use rax in the stub code; we could then
use cxx_fast_tlscc if the linker and the targeted macOS version are new
enough. But that needs changes to dyld, so someone at Apple would have to
drive this.

--

Here's a standalone repro that shows the bug, but the summary above is really
all you need.

% cat tlvhost.cc
extern thread_local int j;
thread_local int j = 0;

% clang -O2 tlvhost.cc  -std=c++11 -shared -o tlvhost.dylib

% cat tlv.cc
extern thread_local int j;

int f(int a, int b) {
  int c = a * b;
  int d = a + b;
  int e = a / b;
  int f = a - b;
  int g = a - 2 * b;
  int h = a - 3 * b;
  int i = a - 4 * b;
  return c / d * e / f + j + g * h + i;
}

% out/gn/bin/clang -O2 tlv.cc  -std=c++11 -shared tlvhost.dylib -o tlv.dylib

% cat main.cc
#include <cstdio>   // printf
#include <cstdlib>  // atoi

extern int f(int a, int b);

int main(int argc, char* argv[]) {
  printf("%d\n", f(atoi(argv[1]), atoi(argv[2])));
}

% clang main.cc tlv.dylib

% ./a.out 1 2
-846192167

The output _should_ be 8, and it is 8 if I remove the `+ j +` bit in tlv.cc. (j
is a tlv that's 0.)
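
As a sanity check on the expected value, the arithmetic from f() can be
evaluated at compile time with j fixed to 0. This is a sketch only: the name
expected_f is hypothetical, and j is passed as a plain parameter instead of
being read from thread-local storage.

```cpp
#include <cassert>

// Same expression as f() in tlv.cc, written as a single return statement so
// that it stays a valid C++11 constexpr function. `j` is an ordinary
// parameter here, not a thread_local variable.
constexpr int expected_f(int a, int b, int j) {
  return (a * b) / (a + b) * (a / b) / (a - b)  // c / d * e / f
         + j
         + (a - 2 * b) * (a - 3 * b)            // g * h
         + (a - 4 * b);                         // i
}

// For a = 1, b = 2, j = 0: 0 + 0 + (-3 * -5) + (-7) == 8.
static_assert(expected_f(1, 2, 0) == 8, "f(1, 2) should be 8 when j is 0");
```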




(The repro uses clang built at 9b6c8132d3785269512803ff51cb421f8d8bcf0e and
depends on the optimizer. The asm clang generated for me is pasted below;
note how r11 is set before the call to __ZTW1j and read right after it:

 % out/gn/bin/clang -O2 tlv.cc  -std=c++11 -shared tlvhost.dylib -S -o -
clang: warning: tlvhost.dylib: 'linker' input unused
[-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-shared'
[-Wunused-command-line-argument]
        .section        __TEXT,__text,regular,pure_instructions
        .build_version macos, 10, 15
        .globl  __Z1fii                         ## -- Begin function _Z1fii
        .p2align        4, 0x90
__Z1fii:                                ## @_Z1fii
        .cfi_startproc
## %bb.0:                               ## %entry
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        pushq   %rbx
        pushq   %rax
        .cfi_offset %rbx, -24
                                        ## kill: def $esi killed $esi def $rsi
        movl    %edi, %ecx
        movl    %esi, %r10d
        imull   %edi, %r10d
        leal    (%rsi,%rcx), %r9d
        movl    %edi, %eax
        cltd
        idivl   %esi
        movl    %eax, %r8d
        subl    %esi, %edi
        movl    %ecx, %r11d
        subl    %esi, %r11d
        subl    %esi, %r11d
        leal    (%rsi,%rsi,2), %eax
        movl    %ecx, %ebx
        subl    %eax, %ebx
        shll    $2, %esi
        movl    %r10d, %eax
        cltd
        idivl   %r9d
        imull   %r8d, %eax
        cltd
        idivl   %edi
        movl    %eax, %edx

        # Calling into dyld_stub_binder here:
        callq   __ZTW1j

        # Using the now-clobbered r11 register right after:
        imull   %r11d, %ebx
        subl    %esi, %ecx
        addl    %ebx, %ecx
        addl    %ecx, %edx
        addl    (%rax), %edx
        movl    %edx, %eax
        addq    $8, %rsp
        popq    %rbx
        popq    %rbp
        retq
        .cfi_endproc
                                        ## -- End function
.subsections_via_symbols
)
