[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Wed Aug 12 00:03:29 PDT 2020

sameerds added inline comments.

================
Comment at: llvm/docs/ConvergentOperations.rst:280
+We generally take the stance that reconvergence in acyclic control flow must
+be maximal. The compiler frontend could augment the original code as follows:
+
----------------
It was the optimizer that introduced the ambiguity ... should the optimizer be responsible for adding the necessary intrinsics that preserve the original convergence? 

================
Comment at: llvm/docs/ConvergentOperations.rst:551-553
+This intrinsic defines the *heart* of a loop, i.e. the place where an imaginary
+loop counter is incremented for the purpose of determining convergence
+semantics.
----------------
So the heart is not a property of the loop itself in LLVM IR. It is a place chosen by the frontend based on semantics external to LLVM IR, in a way that allows the frontend to express constraints about convergence in the loop.

================
Comment at: llvm/docs/ConvergentOperations.rst:570
+
+  token @llvm.experimental.convergence.anchor() convergent readnone
+
----------------
Just like the loop intrinsic, this intrinsic occurs in a place chosen by the frontend based on semantics outside of LLVM IR, and used by the frontend to express constraints elsewhere in the IR.

================
Comment at: llvm/docs/ConvergentOperations.rst:611-612
+        dynamic instance of the defining instruction, and
+     2. There is an *n* such that both threads execute U for the *n*'th time
+        with that same token operand value.
+
----------------
The older comments about this seem to have floated away. At risk of repeating the discussion, what is *n* capturing? Is it meant to relate copies of the call U created by unrolling the loop, for example?

================
Comment at: llvm/docs/ConvergentOperations.rst:646-649
+2. Every cycle in the CFG that contains two or more static uses of a
+   convergence token T by
+   :ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>`
+   must also contain the definition of T.
----------------
Just like the *n* property of the loop intrinsic, I think an informational note explaining this will be helpful.

================
Comment at: llvm/docs/ConvergentOperations.rst:653-656
+3. The *convergence region* of a convergence token T is the minimal region in
+   which T is live and used (i.e., the program points dominated by the
+   definition D of T from which a use of T can be reached without leaving the
+   region dominated by D).
----------------
This is not a rule; it's just a definition.

================
Comment at: llvm/docs/ConvergentOperations.rst:658-660
+4. If a convergence region contains a use of a convergence token, then it must
+   also contain its definition. (In other words, convergence regions must be
+   reasonably nested.)
----------------
Since a convergence region is defined for a token, this text needs to bring out the fact that two different tokens are being talked about at this point. Something like: If the convergence region for token T1 contains a use of another token T2, then it must also contain the definition of T2."

================
Comment at: llvm/docs/ConvergentOperations.rst:749-754
+  %outer = call token @llvm.experimental.convergence.anchor()
+  while (counter > 0) {
+    %inner = call token @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ]
+    call void @convergent.operation() [ "convergencectrl"(token %inner) ]
+    counter--;
+  }
----------------
So unrolling is forbidden because it fails to preserve the set of threads that execute the same dynamic instance of loop() for n=0 and n=1?

================
Comment at: llvm/docs/ConvergentOperations.rst:759
+if the loop counter is known to be a multiple, then unrolling is allowed,
+though care must be taken to correct the use of the loop intrinsic.
+For example, unrolling by 2:
----------------
Correcting the use of the loop intrinsic seems to be a delicate matter. There is a rule which talks about "two or more uses by loop()" inside a loop body, and this particular example seems to side-step exactly that by eliminating one call to loop().

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603