[llvm-dev] [RFC] Coroutine and pthread_self

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 23 12:39:25 PST 2020


I don't think it would be a blocker. You can certainly construct a test
case where pthread_self is in the critical path (imagine a macro that uses
pthread_self to do some sort of TLS access), but it seems unlikely. If
coroutines/fibers had always existed and pthread_self was not marked
`const`, I don't think we would be spending the time and effort to add a
new optimization attribute to optimize it.
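
To make the hypothetical concrete, here is a minimal sketch of the kind of macro meant above: one that uses pthread_self to find per-thread state. All names in it (PerThreadCounters, BUMP_PER_THREAD_COUNTER, bump_from_two_threads) are made up for illustration and do not come from glibc, LLVM, or any real project.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical table of per-thread counters keyed by pthread_self().
// Illustration only; this is not an API from glibc or LLVM.
class PerThreadCounters {
public:
    // Increments the calling thread's counter and returns the new value.
    int bump_current_thread() {
        const pthread_t self = pthread_self();  // the call a compiler may CSE
        std::lock_guard<std::mutex> lock(mu_);
        for (auto& slot : slots_) {
            if (pthread_equal(slot.first, self)) return ++slot.second;
        }
        slots_.emplace_back(self, 1);  // first use on this thread
        return 1;
    }

    // Number of distinct threads that have bumped a counter so far.
    std::size_t thread_count() {
        std::lock_guard<std::mutex> lock(mu_);
        return slots_.size();
    }

private:
    std::mutex mu_;
    std::deque<std::pair<pthread_t, int>> slots_;
};

// Hypothetical macro in the spirit of "a macro that uses pthread_self".
#define BUMP_PER_THREAD_COUNTER(table) ((table).bump_current_thread())

// Bumps the counter twice on the calling thread and once on a helper thread.
// Returns the value observed by the helper thread.
int bump_from_two_threads(PerThreadCounters& table) {
    BUMP_PER_THREAD_COUNTER(table);
    BUMP_PER_THREAD_COUNTER(table);
    int seen_by_helper = 0;
    std::thread helper([&] { seen_by_helper = BUMP_PER_THREAD_COUNTER(table); });
    helper.join();
    assert(table.thread_count() == 2);  // caller and helper got separate slots
    return seen_by_helper;
}
```

If a stale pthread_self() result were reused after the code had moved to another thread, a lookup like this would silently land in the wrong slot. That is the failure mode in question, even if such code is rarely hot.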

On Fri, Nov 20, 2020 at 10:36 AM Xun Li <lxfind at gmail.com> wrote:

> Reid,
>
> Thanks for the suggestion. That's a good idea.
> One concern would be, when this new fiber-safe TLS option is enabled,
> pthread_self() will not be optimized even in functions where no
> coroutine is used. Do you think that would be a blocker?
>
> On Fri, Nov 20, 2020 at 7:41 AM Reid Kleckner <rnk at google.com> wrote:
> >
> > This calls to mind the MSVC /GT option for fiber-safe TLS:
> >
> > https://docs.microsoft.com/en-us/cpp/build/reference/gt-support-fiber-safe-thread-local-storage?view=msvc-160
> > It seems reasonable to implement something similar in LLVM to solve the
> > problem of coroutines and TLS.
> >
> > For pthread_self, instead of inventing a new attribute, would it be
> > enough for clang to ignore the attribute when this new fiber-safe TLS
> > option is enabled?
> >
> > On Wed, Nov 18, 2020 at 2:07 PM Xun Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>
> >> Hi,
> >>
> >> I would like to propose a potential solution to a bug that involves
> >> coroutines and pthread_self().
> >>
> >> A description of the bug can be found at
> >> https://bugs.llvm.org/show_bug.cgi?id=47833. Below is a summary:
> >> pthread_self() from glibc is declared with "__attribute__
> >> ((__const__))". The const attribute tells the compiler that the
> >> function neither reads nor writes any global state and hence always
> >> returns the same result. Hence in the following code:
> >>
> >> auto x1 = pthread_self();
> >> ...
> >> auto x2 = pthread_self();
> >>
> >> the second call to pthread_self() can be optimized out. This was
> >> correct until coroutines were introduced. With coroutines, we can
> >> have code like this:
> >>
> >> auto x1 = pthread_self();
> >> co_await ...
> >> auto x2 = pthread_self();
> >>
> >> Now because of the co_await, the function can suspend and resume in a
> >> different thread, in which case the second call to pthread_self()
> >> should return a different result than the first one. Unfortunately
> >> LLVM will still optimize out the second call in the case of
> >> coroutines.
> >>
> >> I tried simply disabling all value reuse across a coro.suspend in
> >> every CSE-related pass (https://reviews.llvm.org/D89711), but that
> >> doesn't seem scalable and it puts a burden on pass writers. So I would
> >> like to propose a new solution.
> >>
> >> Proposed Solution:
> >> First of all, we need to update the Clang front-end to special-case
> >> the attributes of the pthread_self function: replace the ConstAttr
> >> attribute of pthread_self with a new attribute, say "ThreadConstAttr".
> >> Next, in the emitted IR, functions with "ThreadConstAttr" will have a
> >> new IR attribute, say "thread_readnone".
> >> Finally, there are two possible sub-solutions to handle this new IR
> >> attribute:
> >> a) We add a new pass after CoroSplitPass that changes all the
> >> "thread_readnone" attributes back to "readnone". This will allow it to
> >> work properly prior to CoroSplit, and still provide a chance to do CSE
> >> after CoroSplit. This approach is the simplest to implement.
> >> b) We never remove "thread_readnone". However, we teach memory alias
> >> analysis to understand that functions with the "thread_readnone"
> >> attribute interfere only with coro.suspend intrinsics and nothing else.
> >> Hopefully this will still enable CSE. I am not sure how feasible this is.
> >>
> >> Does the above solution (esp (a)) sound reasonable? Any feedback is
> >> appreciated. Thank you!
> >>
> >> A related issue, which may require separate solutions, is that
> >> coroutines also do not work properly with thread local storage. This
> >> is because an access to thread local storage in LLVM IR is simply a
> >> reference. However, the address behind such a reference can change
> >> after a coro.suspend. This is not taken care of today.
> >> In this thread I would like to focus on the issue with pthread_self
> >> first, but it's good to have context regarding the thread local
> >> storage issue when discussing solution space.
> >> --
> >> Xun
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
> Xun
>
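
For readers following the thread, the behaviour at issue can be sketched without a real coroutine. The example below is a simulation under stated assumptions: the code "after the co_await" is simply run on a helper std::thread, standing in for a coroutine that is resumed on a different thread. The function names and the tls_marker variable are hypothetical and exist only for this illustration.

```cpp
#include <pthread.h>

#include <cassert>
#include <thread>

// A thread-local variable; each thread gets its own copy at its own address.
thread_local int tls_marker = 0;

// Mimics "x1 = pthread_self(); co_await ...; x2 = pthread_self();" where the
// code after the suspension point runs on a different thread. The second half
// is run on a helper std::thread; no real coroutine is involved.
bool thread_id_changes_across_resume() {
    const pthread_t x1 = pthread_self();  // before the simulated suspend
    bool same_thread = true;
    std::thread resumer([&] {
        const pthread_t x2 = pthread_self();  // after the simulated resume
        same_thread = pthread_equal(x1, x2) != 0;
    });
    resumer.join();
    return !same_thread;
}

// Same idea for thread local storage: the address of a thread_local variable
// taken before the simulated suspend differs from the one seen after resume.
bool tls_address_changes_across_resume() {
    const int* before = &tls_marker;
    bool same_address = true;
    std::thread resumer([&] { same_address = (&tls_marker == before); });
    resumer.join();
    return !same_address;
}

// Small driver: both observations are expected to hold while the original
// thread is still alive, which is the case here.
void run_demo() {
    assert(thread_id_changes_across_resume());
    assert(tls_address_changes_across_resume());
}
```

A compiler that treats pthread_self() as const, or a TLS address as fixed for the whole function, would be free to reuse the value from before the suspension point. That is the miscompile described in the original message.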