[llvm-dev] [RFC] Coroutine and pthread_self

Mon Dec 14 09:00:01 PST 2020

Thank you for the summary, James.

On Thu, Dec 10, 2020 at 9:01 PM James Y Knight <jyknight at google.com> wrote:

> I view what happened is that we've introduced effectively a new form of IR
> (pre-coroutine-split IR), yet when doing so, we failed to realize
> the impact that a "suspend" allowing "thread identity" to change would have
> on the rest of LLVM. So, oops! And now we need to design what things really
> should be looking like here going forward.
>
> Let me try to make the two proposals so far more concrete.
>
> Assumptions:
> - We're not going to move CoroSplit early in the pipeline, so this issue
> really does need to be solved.
> - We should not change the meaning of the C `attribute((const))` to
> prohibit reading the thread-identity, but, we can change what LLVM IR that
> transforms into.
> - As a prerequisite, we have already solved the TLS-address-is-a-constant
> issue -- e.g. by moving TLS address lookup to an intrinsic (I have some
> other comments about that issue, separately).
>

Sounds right.

Proposal 1:
> - The readnone attribute is redefined to prohibit depending on thread id.
>

As I've explained before, I don't think this is a redefinition. It is the
context in which the definition is interpreted that has changed, not the
definition itself.

> - Create a new attribute "readthreadidonly", which is similar to readnone,
> except that it may access the current thread identity.
> - The new @llvm.threadlocaladdr intrinsic (or whatever it ends up as) is
> given this attribute.
> - Clang's `attribute((const))` emits this attribute, instead of readnone.
> - Any other frontends which emit the "readnone" attribute may also need to
> be updated and emit readthreadidonly instead of readonly.
> - Optimization passes that are oblivious to the new readthreadidonly will
> fail correct -- they won't *misoptimize* by moving such a call across an
> llvm.coro.suspend call (good) -- but they also will regress performance of
> straightline code (bad).
> - To avoid regressing performance, we'll need to update all optimization
> passes which currently handle readnone, to also handle readthreadidonly
> similarly.
>    - Option 1: (quick'n'dirty) treat readthreadidonly as readnone
> (allowing arbitrary movement) on any function which doesn't contain an
> llvm.coro.suspend, and treat readthreadidonly as inaccessiblememonly if
> there are any suspends.
>    - Option 2: actually understand that readthreadid can be moved anywhere
> except across an llvm.coro.suspend call.
>

I wonder if there's an:

Option 3: After (or during) coroutine splitting, upgrade readthreadidonly
to readnone.

This is based on the assumption that after coroutine splitting, threadid is
no longer mutable state.

Doing the upgrade any earlier is incorrect because of function inlining.

This is functionally a subset of what Option 1 is doing, but requires
changes to fewer transforms.

Proposal 2:
> - The readnone attribute (continues to) allow depending on thread id.
> - C frontend's `attribute((const))` continues to emit the readnone
> attribute.
> - The new @llvm.threadlocaladdr intrinsic is marked readnone.
> - (Optional) add a not_threadid_dependent attribute, which says that a
> function doesn't care which thread it's invoked on.
> - Neither clang nor other frontends need to change which IR attributes
> they emit.
> - Optimization passes must be taught not to move readnone calls across
> llvm.coro.suspend, or they will be incorrect.
>    - Option 1: (quick'n'dirty) treat 'readnone' as 'inaccessiblememonly'
> in any function which contains a llvm.coro.suspend.
>    - Option 2: actually understand that readnone can move anywhere in the
> function except across an llvm.coro.suspend call (unless
> not_threadid_dependent is present, in which case it can also move across
> that).
>
> I'm still slightly in favor of proposal 2, but either seems OK. Both seem
> like they'll be effectively the same amount of work to implement in LLVM. I
> feel like having the "not_threadid_dependent" attribute be separate from
> "readnone" (doesn't access memory) is logical and consistent -- and having
> them be distinct may potentially also be useful in other circumstances.
>

If we compare the two proposals, they ultimately offer the same level of
expressiveness:

p1 readnone == p2 readnone+not_threadid_dependent
p1 readthreadidonly == p2 readnone
p1 _no attribute_ == p2 _no attribute_

... and so yes, it's plausible that they're effectively the same amount of
work in the end.

The arguments against each of the proposals as far as I'm aware:

Downside of proposal 1: it introduces an attribute "readthreadidonly",
which seems unappealing, presumably because it feels like too much
special-casing of threadid.

Downside of proposal 2: it changes the meaning of "read none" from "reads
no mutable state" to "reads almost no mutable state", which is unintuitive
and invites bugs due to misunderstandings.

I agree that a "readthreadidonly" seems unappealing, but the downside of
proposal 2 weighs far heavier to me. It adds permanent mental baggage that
everybody working with LLVM IR will have to carry around with them for the
indefinite future.

Perhaps there are ways to address the downside of proposal 1? For example,
could we make a parameterized version of `readnone` that reads like
`readnoneexcept(threadid)`? This could also fit nicely with a
`readnoneexcept(inaccessiblemem)`.

Cheers,
Nicolai
-- 
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201214/a307f4d8/attachment-0001.html>