[llvm-dev] question about xray tls data initialization

comic fans via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 21 07:32:23 PST 2017


With some dirty hacks, I've got the xray runtime to 'build' on Windows,
but unfortunately I don't know enough about the linker and the
runtime, and the resulting executable doesn't run. I'd like to share
my changes here in the hope that somebody can help me make it run on
Windows.

In AsmPrinter, I copy/pasted the xray section setup for the COFF target:

InstMap = OutContext.getCOFFSection("xray_instr_map", 0,
SectionKind::getReadOnlyWithRel());
FnSledIndex = OutContext.getCOFFSection("xray_fn_idx",
0,SectionKind::getReadOnlyWithRel());

In XRayArgs, I allowed the Windows platform to use the xray args. With
this, the generated code seems to have the sleds and the xray sections.

In the xray runtime,
 bool atomic_compare_exchange_strong(volatile atomic_sint32_t *a,
                                           s32 *cmp,
                                           s32 xchg,
                                           memory_order mo)
is missing for MSVC; I took the atomic_uint32_t implementation as the basis.
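
A minimal sketch of what such a signed 32-bit overload could look like. This is not the sanitizer_common code: the namespace, the struct layout and the one-value memory_order enum are stand-ins defined here so the snippet is self-contained, and the memory order argument is ignored (both intrinsics used are full barriers). On MSVC it goes through _InterlockedCompareExchange, elsewhere through the GCC/Clang __sync builtin.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

namespace sketch {

typedef int32_t s32;

// Stand-ins for the runtime's own types (assumed shapes, for illustration).
enum memory_order { memory_order_seq_cst };
struct atomic_sint32_t {
  volatile s32 val_dont_use;
};

// Returns true and stores xchg if *a == *cmp; otherwise returns false and
// writes the value actually observed back into *cmp.
inline bool atomic_compare_exchange_strong(volatile atomic_sint32_t *a,
                                           s32 *cmp, s32 xchg,
                                           memory_order /*mo*/) {
  const s32 expected = *cmp;
#if defined(_MSC_VER)
  // long is 32 bits wide on Windows, so the cast keeps the operand size.
  const s32 prev = static_cast<s32>(_InterlockedCompareExchange(
      reinterpret_cast<volatile long *>(&a->val_dont_use),
      static_cast<long>(xchg), static_cast<long>(expected)));
#else
  const s32 prev =
      __sync_val_compare_and_swap(&a->val_dont_use, expected, xchg);
#endif
  if (prev == expected)
    return true;
  *cmp = prev;
  return false;
}

} // namespace sketch
```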

MSVC 14.1 treats BufferQueue::Buffer::Buffer as a constructor instead of
a data member, so I renamed the member: Buf.Buffer => Buf.Data

FunctionRecord packing: __attribute__((packed)) => #pragma
pack(push,1). MSVC also requires bitfields to have the same type in
order to pack them together (all types => uint32_t).
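
To illustrate the bitfield point, here is a sketch of a record packed this way. The struct name and field widths are assumptions for illustration, not a copy of the real FunctionRecord. With every bitfield declared as uint32_t, MSVC, GCC and Clang can all place the three fields in one 32-bit unit, so the record stays at 8 bytes. Note that turning a signed bitfield into uint32_t changes how negative values read back.

```cpp
#include <cstdint>

namespace sketch {

#pragma pack(push, 1)
// Hypothetical record layout: 1 + 3 + 28 bits share one uint32_t unit.
struct FunctionRecordSketch {
  uint32_t Type : 1;
  uint32_t RecordKind : 3;
  uint32_t FuncId : 28;
  uint32_t TSCDelta;
};
#pragma pack(pop)

// Fails to compile if the compiler did not merge the bitfields.
static_assert(sizeof(FunctionRecordSketch) == 8,
              "bitfields were not packed into a single 32-bit unit");

} // namespace sketch
```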

FD: int => HANDLE. Most of the code logic is still valid (-1 as the
invalid value); the read/write API calls are replaced with their
Windows equivalents.

mprotect => VirtualProtect
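
A rough sketch of that mapping, assuming a small wrapper (the names setWritable, allocatePage and pageSize are made up here). It only toggles between read-only and read-write to keep the example easy to run; the protection flags the runtime actually needs for patching sleds would have to be mapped the same way.

```cpp
#include <cstddef>
#if defined(_WIN32)
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

namespace sketch {

inline std::size_t pageSize() {
#if defined(_WIN32)
  SYSTEM_INFO si;
  GetSystemInfo(&si);
  return si.dwPageSize;
#else
  return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
#endif
}

// Returns a page-aligned, readable and writable region, or nullptr.
inline void *allocatePage(std::size_t len) {
#if defined(_WIN32)
  return VirtualAlloc(nullptr, len, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
#else
  void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
#endif
}

// addr must be page-aligned. Returns true on success.
inline bool setWritable(void *addr, std::size_t len, bool writable) {
#if defined(_WIN32)
  DWORD oldProtect = 0; // VirtualProtect requires a valid out-pointer
  return VirtualProtect(addr, len, writable ? PAGE_READWRITE : PAGE_READONLY,
                        &oldProtect) != 0;
#else
  return mprotect(addr, len,
                  writable ? (PROT_READ | PROT_WRITE) : PROT_READ) == 0;
#endif
}

} // namespace sketch
```

One difference worth keeping in mind: VirtualProtect reports failure with a zero return value and always wants a place to store the old protection, while mprotect returns -1 on failure.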

readTSC in xray_x86_64.inc also works on Windows.

Reading the TSC frequency from proc is replaced with QueryPerformanceFrequency.

MSVC cannot compile code like this:
void setupNewBuffer(int (*wall_clock_reader)(clockid_t,
                                                    struct timespec *));

The function pointer type must be given a typedef first. xray uses
clock_gettime as the default implementation, which is not friendly to
Windows, so I created a fake one based on chrono system_clock (it
ignores clockid_t).
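
Roughly what I mean, as a self-contained sketch. ClockIdSketch, WallClockReader, fakeClockGettime and readSeconds are names invented for this example; the real code would use the runtime's own signature. It assumes system_clock counts from the Unix epoch, which is what the common implementations do.

```cpp
#include <chrono>
#include <time.h>

namespace sketch {

// Stand-in for clockid_t, which plain Windows headers do not provide.
typedef int ClockIdSketch;

// Naming the function pointer type first keeps the parameter list simple.
typedef int (*WallClockReader)(ClockIdSketch, struct timespec *);

// clock_gettime look-alike built on std::chrono; the clock id is ignored.
inline int fakeClockGettime(ClockIdSketch /*unused*/, struct timespec *ts) {
  if (ts == nullptr)
    return -1;
  using namespace std::chrono;
  const auto sinceEpoch = system_clock::now().time_since_epoch();
  const auto secs = duration_cast<seconds>(sinceEpoch);
  const auto nanos = duration_cast<nanoseconds>(sinceEpoch - secs);
  ts->tv_sec = static_cast<time_t>(secs.count());
  ts->tv_nsec = static_cast<long>(nanos.count());
  return 0;
}

// Example consumer taking the typedef'd pointer: returns whole seconds,
// or -1 if the reader fails.
inline long long readSeconds(WallClockReader reader) {
  struct timespec ts = {};
  if (reader(0, &ts) != 0)
    return -1;
  return static_cast<long long>(ts.tv_sec);
}

} // namespace sketch
```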

For the TLS destructor part, I've just commented it out for now (but
https://www.codeproject.com/Articles/8113/Thread-Local-Storage-The-C-Way
describes a thread-exit callback approach for COFF).

The last thing, which I don't understand, is the weak symbols for
__start_xray_instr_map[]
 __stop_xray_instr_map[]
 __start_xray_fn_idx[]
 __stop_xray_fn_idx[]

I replaced them with __declspec(selectany), but I'm not sure it has
the same meaning.
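
For reference, this is a sketch of how I understand the ELF side. For a section whose name is a valid C identifier, the ELF linker itself defines __start_<name> and __stop_<name> at the section bounds, and the weak declaration just lets the program link (with null addresses) when no such section exists. The section name demo_sled_map and the entry type are made up for this example. As far as I know, __declspec(selectany) only tells the linker to pick one of several definitions of a symbol, so it would not by itself give the bounds of a section.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {
struct SledEntrySketch {
  uint64_t Address;
  uint64_t Function;
};
} // namespace sketch

#if defined(__ELF__)
// Two hypothetical entries placed into a custom section.
__attribute__((section("demo_sled_map"), used))
static const sketch::SledEntrySketch DemoEntries[2] = {{0x1000, 0x1000},
                                                       {0x1056, 0x1000}};

// Defined by the ELF linker at the start and end of demo_sled_map.
extern "C" {
extern const sketch::SledEntrySketch __start_demo_sled_map[]
    __attribute__((weak));
extern const sketch::SledEntrySketch __stop_demo_sled_map[]
    __attribute__((weak));
}
#endif

namespace sketch {

// Number of entries the linker collected, or 0 if the section is absent.
inline std::size_t demoSledCount() {
#if defined(__ELF__)
  if (__start_demo_sled_map == nullptr || __stop_demo_sled_map == nullptr)
    return 0;
  const uintptr_t begin = reinterpret_cast<uintptr_t>(__start_demo_sled_map);
  const uintptr_t end = reinterpret_cast<uintptr_t>(__stop_demo_sled_map);
  return (end - begin) / sizeof(SledEntrySketch);
#else
  return 0; // no linker-provided bounds in this sketch for non-ELF targets
#endif
}

} // namespace sketch
```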


Some sample generated code:
    .text
    .intel_syntax noprefix
    .def     call;
    .scl    2;
    .type    32;
    .endef
    .globl    call                    # -- Begin function call
    .p2align    4, 0x90
call:                                   # @call
.seh_proc call
# BB#0:                                 # %entry
    .p2align    1, 0x90
.Lxray_sled_0:
    .ascii    "\353\t"
    nop    word ptr [rax + rax + 512]
    sub    rsp, 16
    .seh_stackalloc 16
    .seh_endprologue
    mov    dword ptr [rsp + 12], ecx
    mov    dword ptr [rsp + 8], 0
    mov    dword ptr [rsp + 4], 0
.LBB0_1:                                # %for.cond
                                        # =>This Inner Loop Header: Depth=1
    mov    eax, dword ptr [rsp + 4]
    cmp    eax, dword ptr [rsp + 12]
    jge    .LBB0_4
# BB#2:                                 # %for.body
                                        #   in Loop: Header=BB0_1 Depth=1
    mov    eax, dword ptr [rsp + 4]
    add    eax, dword ptr [rsp + 8]
    mov    dword ptr [rsp + 8], eax
# BB#3:                                 # %for.inc
                                        #   in Loop: Header=BB0_1 Depth=1
    mov    eax, dword ptr [rsp + 4]
    add    eax, 1
    mov    dword ptr [rsp + 4], eax
    jmp    .LBB0_1
.LBB0_4:                                # %for.end
    mov    eax, dword ptr [rsp + 8]
    add    rsp, 16
    .p2align    1, 0x90
.Lxray_sled_1:
    ret
    nop    word ptr cs:[rax + rax + 512]
    .seh_handlerdata
    .text
    .seh_endproc
                                        # -- End function
    .section    xray_instr_map,"y"
.Lxray_sleds_start0:
    .quad    .Lxray_sled_0
    .quad    call
    .byte    0x00
    .byte    0x00
    .byte    0x00
    .zero    13
    .quad    .Lxray_sled_1
    .quad    call
    .byte    0x01
    .byte    0x00
    .byte    0x00
    .zero    13
.Lxray_sleds_end0:
    .section    xray_fn_idx,"y"
    .p2align    4, 0x90
    .quad    .Lxray_sleds_start0
    .quad    .Lxray_sleds_end0
    .text

And parts of the object file dump:


SECTION HEADER #5
     /16 name (xray_instr_map)
       0 physical address
       0 virtual address
      40 size of raw data
     198 file pointer to raw data (00000198 to 000001D7)
     1D8 file pointer to relocation table
       0 file pointer to line numbers
       4 number of relocations
       0 number of line numbers
  100000 flags
         1 byte align

RAW DATA #5
  00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000020: 56 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  V...............
  00000030: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................


RELOCATIONS #5
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000000  ADDR64            00000000 00000000         0  .text
 00000008  ADDR64            00000000 00000000         E  call
 00000020  ADDR64            00000000 00000056         0  .text
 00000028  ADDR64            00000000 00000000         E  call

SECTION HEADER #6
      /4 name (xray_fn_idx)
       0 physical address
       0 virtual address
      10 size of raw data
     200 file pointer to raw data (00000200 to 0000020F)
     210 file pointer to relocation table
       0 file pointer to line numbers
       2 number of relocations
       0 number of line numbers
  500000 flags
         16 byte align

RAW DATA #6
  00000000: 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00  ........@.......

RELOCATIONS #6
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000000  ADDR64            00000000 00000000         8  xray_instr_map
 00000008  ADDR64            00000000 00000040         8  xray_instr_map

On Tue, Nov 21, 2017 at 7:46 PM, Dean Michael Berris
<dean.berris at gmail.com> wrote:
>
> On 17 Nov 2017, at 00:44, comic fans via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
>
> I'm learning the xray library and try if it can be built on windows,  in
> xray_fdr_logging_impl.h
>
> line 152  , comment written as
> // Using pthread_once(...) to initialize the thread-local data structures
>
>
> but at line 175, 183, code written as
>
> thread_local pthread_key_t key;
>
>  // Ensure that we only actually ever do the pthread initialization once.
>  thread_local bool UNUSED Unused = [] {
>    new (&TLSBuffer) ThreadLocalData();
>    auto result = pthread_key_create(&key, +[](void *) {
>      auto &TLD = *reinterpret_cast<ThreadLocalData *>(&TLSBuffer);
>
>
> I'm confused that pthread_key_t and Unused are both thread_local
> variable, doesn't it mean the following lambda will run for each
> thread , and create one pthread_key_t for only one tls data(instead of
> only one pthread_key_t for all thread) ? also what does the '+' before
> lambda expression mean ?  this may be stupid questions, could somebody
> kindly  helped ?
>
>
> Yeah, that comment is out-of-date (and the implementation is buggy) -- which
> is a shame really. :/
>
> But, the good news, is I think we've fixed this now in the top-of-trunk with
> https://reviews.llvm.org/D39526 and https://reviews.llvm.org/D40164.
>
> Curiously though, how far did your exploration into getting XRay to build on
> Windows go?
>
> Cheers
>
> -- Dean
>

