[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Sun Aug 9 11:21:02 PDT 2020

efriedma added inline comments.

================
Comment at: llvm/docs/ConvergentOperations.rst:299
+Behavior is undefined if this intrinsic appears inside of another convergence
+region or outside of a function's entry block.
+
----------------
Could we get away without the "outside of a function's entry block" restriction?  It seems sort of inconvenient that transforming a select to an if-then-else requires scanning the entire basic block.  I guess we have to do that scan anyway, though, given the way alloca is defined, so maybe not a big deal.

================
Comment at: llvm/docs/ConvergentOperations.rst:351
+   (2) there is an *n* such that both threads execute U for the *n*'th time
+   with that same token operand value.
+
----------------
Say you have a loop with a non-uniform trip count; does this mean the threads are allowed to communicate for the iterations that both threads execute?

================
Comment at: llvm/docs/ConvergentOperations.rst:365
+   due to a call from IR, then the thread cannot "spontaneously converge" with
+   threads that execute the function for some other reason.)
+
----------------
I'd prefer to define the "call stack" abstraction in a way that doesn't assume the whole world is LLVM IR.

================
Comment at: llvm/docs/ConvergentOperations.rst:368
+4. Target-specific rules determine whether two threads execute the same
+   dynamic instance of an uncontrolled convergent operation.
+
----------------
"uncontolled divergent operation", meaning a convergent operation without a token?  Can you just say that's outside the scope of this document earlier, where you say it's deprecated?

================
Comment at: llvm/docs/ConvergentOperations.rst:404
+barrier sense nor in the control barrier sense of synchronizing the execution
+of threads.
+
----------------
It's a bit of an exaggeration to say it has no effect on the memory model.  Consider the thread group reduction example: there's implicitly some bit of "memory" used to communicate.  (For the definition of readnone, "memory" is anything used to store/communicate state.)  Whether that bit of memory is the same for two instructions depends on whether they correspond to the same dynamic instance.

Of course, if you don't use any attributes, we'll conservatively assume that the memory accessed by an intrinsic depends on the current thread ID or something like that, so this is really only interesting if you're using readonly/readnone/etc.

================
Comment at: llvm/docs/ConvergentOperations.rst:407
+Threads that execute the same dynamic instance do not necessarily do so at the
+same time.
+
----------------
You don't really define "same time" anywhere. That's probably outside the scope of this document anyway, but not sure referring to it here adds anything.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603