[llvm-dev] Thread migration during function execution, semantics of thread local variables

Tue Aug 31 14:30:13 PDT 2021

There was a discussion on a very similar topic with regard to C++20
coroutines back in November/December 2020 entitled "[RFC] Coroutine and
pthread_self". It discusses exactly the same issues you will run into --
although for coroutines, the issue only occurs in early optimization
passes, because eventually the coroutine with yield-points gets transformed
into a "normal" function.
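
For concreteness, here is a minimal C sketch of the underlying hazard,
assuming a POSIX threads environment and a C11 compiler: the same
thread-local variable lives at a different address in each thread, so an
address computed before a thread switch is stale afterwards. The names
`var`, `record_var_address` and `tls_address_differs_across_threads` are
made up for illustration and do not come from the earlier thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Plays the role of @var in the quoted IR. */
static _Thread_local int var;

/* Thread entry point: store the address of this thread's copy of `var`
 * as an integer, so it can still be compared after the thread exits. */
static void *record_var_address(void *out) {
    assert(out != NULL);
    *(uintptr_t *)out = (uintptr_t)&var;
    return NULL;
}

/* Returns 1 if a second thread sees `var` at a different address than
 * the calling thread, 0 if the addresses match, -1 on a pthread error. */
int tls_address_differs_across_threads(void) {
    pthread_t worker;
    uintptr_t worker_addr = 0;

    if (pthread_create(&worker, NULL, record_var_address, &worker_addr) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;

    /* Both copies existed at the same time, so they cannot overlap. */
    return worker_addr != (uintptr_t)&var;
}
```

A task that cached `&var` on the worker and then resumed on the calling
thread would keep touching the worker's copy, which is the situation the
Darwin output below ends up in.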

Note that TLS access is not the only problem you have -- the removal of
redundant function calls across a thread switch will also be a problem.
That removal is enabled by, e.g., LLVM IR's "readnone" attribute, which is
generated from C's __attribute__((const)); pthread_self, for one, carries it.
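
A rough C sketch of that pattern, not taken from either thread: glibc
declares pthread_self with __attribute__((const)), so the optimizer is
allowed to assume both calls below return the same value and fold them
into one. The names `same_thread_before_and_after` and `stay_put` are
hypothetical; `maybe_migrate` stands in for whatever call may switch
threads in your runtime.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical callback that never migrates; a real runtime's yield or
 * scheduling call would go here instead. */
void stay_put(void) {}

/* Returns nonzero if pthread_self() reports the same thread before and
 * after the callback. Because pthread_self is declared const in glibc,
 * an optimizing compiler may reuse `before` for `after` instead of
 * calling it again, so with real migration this could report "same
 * thread" even after a switch. */
int same_thread_before_and_after(void (*maybe_migrate)(void)) {
    assert(maybe_migrate != 0);
    pthread_t before = pthread_self();
    maybe_migrate();
    pthread_t after = pthread_self();
    return pthread_equal(before, after);
}
```

With a callback that stays on one thread, as `stay_put` does, the function
returns nonzero whether or not the second call is folded; the difference
only becomes observable once the callback can resume on another thread.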

See the thread starting here:
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146766.html
and then into the next month here:
https://lists.llvm.org/pipermail/llvm-dev/2020-December/147012.html

The work in this area has not yet been completed.

On Tue, Aug 31, 2021 at 1:44 PM Valentin Churavy via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi LLVM-dev,
>
> I am working on a runtime system that has task migration, i.e. a task can
> be migrated between different threads. So a function can start executing on
> one thread, call a function (that might call into the runtime),
> and then continue executing on a different thread. This poses a problem with
> thread local variables. As an example, take the program below. After the call
> to `callee` we might have switched threads and thus we need to
> recalculate the location of the thread local variable.
>
> ```
> @var = available_externally thread_local global i32 0, align 4
>
> declare void @callee()
>
> define signext i32 @main() nounwind {
>     entry:
>     %0 = load i32, i32* @var, align 4
>     call void @callee()
>     %1 = load i32, i32* @var, align 4
>     %2 = icmp eq i32 %0, %1
>     %3 = zext i1 %2 to i32
>     ret i32 %3
> }
> ```
>
> As far as I can tell there is no current mechanism to inform LLVM that
> thread migration might occur, and it depends on the backend what behaviour
> you might get.
>
> As an example, compiling for `x86_64-unknown-linux-gnu`, we get:
>
> ```
> movq var@GOTTPOFF(%rip), %rbx
> movl %fs:(%rbx), %ebp
> callq callee@PLT
> xorl %eax, %eax
> cmpl %fs:(%rbx), %ebp
> ```
>
> Which happens to be correct, since both loads are %fs-relative and %fs
> is per-thread. On Darwin on the other hand:
>
> ```
> movq _var@TLVP(%rip), %rdi
> callq *(%rdi)
> movq %rax, %rbx
> movl (%rax), %ebp
> callq _callee
> xorl %eax, %eax
> cmpl (%rbx), %ebp
> ```
>
> the address of the TLS variable gets CSE'd (%rbx is reused after the
> call), and thus the second load could be incorrect.
>
> Has there been any prior work on supporting thread migration + thread
> local storage?
>
> Kind regards,
> Valentin