[PATCH] D124361: [Coroutines] Add coro_maychange intrinsic to solve TLS problem (2/5)

Thu Apr 28 16:56:44 PDT 2022

rjmccall added inline comments.

================
Comment at: clang/lib/CodeGen/CGExpr.cpp:2623
+      CGF.getLangOpts().Coroutines)
+    V = CGF.Builder.CreateCoroMayChange(V);
+
----------------
efriedma wrote:
> rjmccall wrote:
> > I guess this is unnecessary under OpenMP because the privatization logic will already anchor this appropriately in the function.  That's worth mentioning in a comment so that readers don't think the combination is somehow busted.
> > 
> > Same question as the other patch: is there any way to safely only do this in code that's actually going to be part of a coroutine?  Because `getLangOpts().Coroutines` is true for everyone using `-std=c++20`, and most of that code is probably not using coroutines.  It seems like we have two options:
> > - Use this pattern everywhere in the translation unit, and then eliminate it from all functions after we've split all coroutines
> > - Use this pattern in the frontend when directly emitting unsplit coroutine bodies, and also change LLVM to introduce this pattern when cloning code into an unsplit coroutine body (most importantly, in the inliner)
> > 
> > I think the second option makes more sense.  In the first option, the mere possibility of having coroutines in the module could significantly impede optimization.  The second option also means you'll have to find and remove this pattern in *all* functions, not just when transforming coroutines.
> > 
> > You could also use the second option for your `readnone` problem: you can have your early pass make `readnone` calls directly from coroutine bodies `coro_readnone` instead, and the inliner can do the same for `readnone` calls being inlined into unsplit coroutines.
> I'm not really happy with the idea that IR semantics are different in the body of a coroutine; that implies that every interprocedural analysis  or optimization author has to think about the possible interactions with coroutine semantics.  It's much simpler to just pretend a coroutine is a thread.  I mean, you mention inlining, but there's also IPSCCP, and Attributor, maybe GlobalOpt, and other interactions I'm probably not thinking of.
> 
> And really, as long as alias analysis understands llvm.coro.maychange, we can do almost all the optimizations we want to do on thread-local variables anyway.
> 
> But maybe I'm leaning too far towards conceptual purity, as opposed to practicality.
Like I said in Discourse, I think our hands are pretty forced here for TLS, because LLVM IR is assuming in its basic representation that functions are pinned to a single thread, which simply is not true in coroutine bodies.  We have three options:

1. We can globally change the structure of TLS access.  You would not be able to simply use `thread_local` GVs as constants; instead, you would have to use some instruction which yields the address of the TLS for the current thread.
2. We can require different structure for TLS access in unlowered coroutines.
3. We can decline to support coroutines.

I don't think #3 is an option, and #1 isn't really a fight I want to fight, so we're left with #2.  Maybe there's an option I'm not thinking of, though.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124361/new/

https://reviews.llvm.org/D124361