[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Tue Aug 11 01:25:25 PDT 2020

sameerds added inline comments.

================
Comment at: llvm/docs/ConvergentOperations.rst:401
+
+4. If a convergence region contains a use of a convergence token, then it must
+   also contain its definition.
----------------
So this defines a proper nesting of convergence regions? An informative note would be helpful.

================
Comment at: llvm/docs/ConvergentOperations.rst:464-470
+  while (counter >= 2) {
+    %tok = call token @llvm.experimental.convergence.anchor()
+    call void @convergent.operation() [ "convergencectrl"(token %tok) ]
+    %tok = call token @llvm.experimental.convergence.anchor()
+    call void @convergent.operation() [ "convergencectrl"(token %tok) ]
+    counter -= 2;
+  }
----------------
Which part of the formal semantics shows that this is a valid translation? The rule for the execution of dynamic instances seems useful only for specifying which threads execute the convergent operations. But what relates them to the original loop? Is it because the set of dynamic instances produced by the second version has a one-to-one mapping with the set of dynamic instances produced by the first version?

================
Comment at: llvm/docs/ConvergentOperations.rst:516
+:ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>`
+intrinsic outside of the loop header uses a token defined outside of the loop
+can generally not be unrolled.
----------------
I think this intends to say "block in the loop body other than the loop header", but the wording chosen is a little difficult to parse on a first read.

================
Comment at: llvm/docs/ConvergentOperations.rst:202-203
+
+2. Executions of different static instructions always occur in different
+   dynamic instances.
+
----------------
simoll wrote:
> I suppose this only refers to convergent instructions but it isn't clear to me from the wording: Does this constraint apply to all IR instructions or only those that are convergent?
> (Only 4. explicitly mentions convergent operations)
I think the notion of dynamic instances applies to all instructions. Continuing with #3 below, it seems to me that different threads can execute the same dynamic instance of any instruction. It's just that this notion is not very interesting in the case of non-communicating instructions. The ones that communicate need to be marked convergent, so that the effect of transformations on them is limited.

================
Comment at: llvm/docs/ConvergentOperations.rst:522-524
+Assuming that ``%tok`` is only used inside the conditional block, the anchor can
+be sunk. Again, the rationale is that the anchor has implementation-defined
+behavior, and the sinking is part of the implementation.
----------------
t-tye wrote:
> This also confuses me. If anchor is supposed to denote the current set of threads in the current dynamic instance, then it seems undefined IR to use it in the conditional when all those threads cannot be performing the dynamic operation instance. I feel I am missing a fundamental aspect of the formal model.
+1

To me, the whole point of this new concept is to capture control dependency so that we don't have to go look at branch conditions again. But allowing such a transformation reintroduces the need to go check the control dependency to understand which threads are really executing this instance.

================
Comment at: llvm/docs/ConvergentOperations.rst:547-551
+  }
+
+The behavior is unchanged, since each of the static convergent operations only
+ever communicates with threads that have the same ``condition`` value.
+By contrast, hoisting the convergent operations themselves is forbidden.
----------------
t-tye wrote:
> So the convergent token is the set of threads, but any intervening conditional control flow may change which threads a nested convergent operation may be required to communicate with?
> 
> My understanding was that the tokens were intended to be explicit in denoting the involved threads to avoid needing to crawl the LLVM IR to determine the control dependence. And were intended to be explicit in preventing control dependence changes. But these examples seem to contradict that understanding.
> 
> So when a convergent token is used in a dynamic instance of a static convergent operation, what set of threads is it mandating have to participate? Those defined by the dynamic instance of the static token definition that control dependence permits to execute?
This is also the transform that CUDA (and potentially HIP) will disallow. Hoisting or sinking a conditional changes the set of threads executing each leg of the branch. In CUDA, the two programs have completely different meanings depending on whether the anchor is outside the branch or inside each leg. There seems to be an opportunity here to relate the notion of an anchor to language builtins that return the mask of currently executing threads.
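
To make the difference concrete, here is a rough toy model in Python. It is only a sketch: it treats an anchor as if it were a builtin returning the mask of currently executing threads, and it assumes that all threads are converged before the branch. The function names and the eight-thread setup are hypothetical and chosen for illustration; none of this is the semantics defined by the patch or by CUDA.

```python
# Toy model only (assumption): a thread is an integer id, and a "mask" is the
# set of thread ids that reach a program point together. All threads are
# assumed to be converged before the branch.

def masks_with_anchor_outside(threads, condition):
    """Anchor before the branch: a single mask is captured by every thread."""
    anchor_mask = frozenset(threads)
    # The condition is deliberately unused here: both legs reuse the mask
    # that was captured before the branch was taken.
    return {"then": anchor_mask, "else": anchor_mask}


def masks_with_anchor_inside(threads, condition):
    """Anchor in each leg: each leg captures only the threads that took it."""
    return {
        "then": frozenset(t for t in threads if condition(t)),
        "else": frozenset(t for t in threads if not condition(t)),
    }


def is_even(thread_id):
    """Hypothetical branch condition used for the illustration."""
    return thread_id % 2 == 0


threads = range(8)
outside = masks_with_anchor_outside(threads, is_even)
inside = masks_with_anchor_inside(threads, is_even)

# The mask seen in the "then" leg differs between the two placements.
print(sorted(outside["then"]))
print(sorted(inside["then"]))
```

Under these assumptions, the mask available in each leg depends on where the anchor sits, which is why the two placements read as different programs if the anchor is interpreted as a "current mask" builtin.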

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603