[llvm-dev] XRay TID mismatch when forking

Henry Zhu via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 7 06:51:24 PDT 2018


Hello,



There seems to be a bug with the value of the thread ID (TID) that is
generated by XRay when calling fork(). The value of the TID of the child
has the same TID as its parent (this is seen in the “thread: “ field of the
YAML output produced by the llvm-xray tool). This is a problem as the trace
generated contains the same TID for the parent and children processes,
making it difficult to distinguish between threads running in different
processes.


I am running linux on x86_64 machine and am using clang 7.0.

henry at OptiPlex-9010:~$ cat /etc/issue
Ubuntu 16.04.4 LTS \n \l

henry at OptiPlex-9010:~$ uname -a
Linux OptiPlex-9010 4.13.0-43-generic #48~16.04.1-Ubuntu SMP Thu May 17
12:56:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Together with this email I include sample code that should allow you to
reproduce the problem, as well as the output I obtain from running that
code, and that shows the problem described above.

Below is a test file I wrote to show the bug:

The parent process prints its own tid, calls fork ten times and for each
fork, the child prints own TID using the gettid syscall. The syscall code
is based on that of the sanitizer code – to make sure that I am using the
same mechanism as xray to obtain the thread ID from the platform.

=================

#include <stdio.h>

#include <unistd.h>

#include <stdint.h>

#include <sys/syscall.h>



//taken from sanitizer



#define SYSCALL(name) __NR_ ## name



static uintptr_t internal_syscall(uint64_t nr) {

  uint64_t retval;

  asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11",

               "memory", "cc");

  return retval;

}



/////////////



int main()

{

                printf("\nTID of parent: %lu\n",
internal_syscall(SYSCALL(gettid)));

                for(int i = 0; i < 10; i++)

                {

                                if(fork())

                                {

                                }

                                else

                                {

                                                printf("\nTID of child:
%lu\n", internal_syscall(SYSCALL(gettid)));

                                                break;

                                }

                }

                return 0;

}


I compiled the test file using clang 7.0 and ran it:

=================



./clang++ -fxray-instrument test.cpp –o test

XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1" ./test


The output shows that logger prints out the parent TID 11 times, but making
the gettid syscall in the test shows that the TIDs of the children are
different than their parent process.

Here is the output:

=================




==24020==XRay: Log file in 'xray-log.test.yLWPSD'



TID of parent: 24020



TID of child: 24021



TID of child: 24022



TID of child: 24023



TID of child: 24024



TID of child: 24025



TID of child: 24026



TID of child: 24027



TID of child: 24028



TID of child: 24029



TID of child: 24030

==24021==Cleaned up log for TID: 24020

==24022==Cleaned up log for TID: 24020

==24024==Cleaned up log for TID: 24020

==24026==Cleaned up log for TID: 24020

==24025==Cleaned up log for TID: 24020

==24023==Cleaned up log for TID: 24020

==24028==Cleaned up log for TID: 24020

==24027==Cleaned up log for TID: 24020

==24029==Cleaned up log for TID: 24020

==24030==Cleaned up log for TID: 24020

==24020==Cleaned up log for TID: 24020

=================



The llvm-xray tool reports the same TID across all threads when running the
tool on the generated trace file.

=================

---


header:


  version:         2


  type:            0


  constant-tsc:    true


  nonstop-tsc:     true


  cycle-frequency: 3400000000


records:


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-enter, tsc: 4300740411623868 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740412091416 }


  - { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind:
function-exit, tsc: 4300740412505836 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740412671366 }


  - { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind:
function-exit, tsc: 4300740412775572 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740412934286 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740413141394 }


  - { type: 0, func-id: 1, function: main, cpu: 2, thread: 24065, kind:
function-exit, tsc: 4300740413259282 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740413338498 }


  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind:
function-exit, tsc: 4300740413514362 }


  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind:
function-exit, tsc: 4300740413559132 }


- { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind:
function-exit, tsc: 4300740413725842 }



Is this an issue that can be fixed? If not, is there any advice on how to
make xray generate different TIDs when forking on my end?


Attached below is the file I tested with.

Thanks,

Henry
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180607/66697b4c/attachment.html>
-------------- next part --------------
#include <stdio.h>
#include <unistd.h>
#include <stdint.h>
#include <sys/syscall.h>

//taken from sanitizer

#define SYSCALL(name) __NR_ ## name

static uintptr_t internal_syscall(uint64_t nr) {
  uint64_t retval;
  asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11",
               "memory", "cc");
  return retval;
}

/////////////

int main()
{
  	printf("\nTID of parent: %lu\n", internal_syscall(SYSCALL(gettid)));
	for(int i = 0; i < 10; i++)
	{
		if(fork())
		{
		}
		else
		{
  			printf("\nTID of child: %lu\n", internal_syscall(SYSCALL(gettid)));
			break;
		}
	}
	return 0;
}


More information about the llvm-dev mailing list