[PATCH] D68994: [RFC] Redefine `convergent` in terms of dynamic instances

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 15 08:56:38 PDT 2019


nhaehnle created this revision.
nhaehnle added reviewers: arsenm, alex-t, tpr, t-tye.
Herald added subscribers: jdoerfert, zzheng, hiraditya, wdng.
Herald added a project: LLVM.
nhaehnle removed reviewers: arsenm, alex-t, tpr, t-tye.
nhaehnle added subscribers: arsenm, alex-t, tpr, t-tye, jsjodin, jlebar, resistor, simoll, mehdi_amini, __simt__.

GPU-oriented programming languages have some operations with constraints
that cannot currently be expressed properly in LLVM IR. For example:

  uvec4 result;
  if (cc) {
    result = ballot(true);
  } else {
    result = ballot(true);
  }

Even though both sides of the branch are identical, it is incorrect to
replace the if-statement with a single ballot call. This is because
`ballot` communicates with other threads, and the set of those threads
depends on where `ballot` is with respect to control flow.
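
To make the difference concrete, here is a minimal sketch of a toy execution
model, written in Python purely for illustration. It assumes that `ballot`
returns a bitmask of the lanes that execute the same call together; the helper
names (`ballot`, `run_branched`, `run_merged`) and the 4-lane wave are
hypothetical and are not part of the patch or of any real GPU API.

```python
from typing import List


def ballot(active_lanes: List[int], predicate: bool = True) -> int:
    """Toy ballot: bitmask of the active lanes for which the predicate holds.

    Assumption: every active lane passes the same predicate value, as in the
    `ballot(true)` calls of the example above.
    """
    mask = 0
    for lane in active_lanes:
        if predicate:
            mask |= 1 << lane
    return mask


def run_branched(cc: List[bool]) -> List[int]:
    """Original program: each side of the branch has its own ballot call."""
    lanes = range(len(cc))
    then_lanes = [i for i in lanes if cc[i]]
    else_lanes = [i for i in lanes if not cc[i]]
    result = [0] * len(cc)
    # Lanes on the 'then' side only communicate with each other.
    then_mask = ballot(then_lanes)
    for i in then_lanes:
        result[i] = then_mask
    # Likewise for lanes on the 'else' side.
    else_mask = ballot(else_lanes)
    for i in else_lanes:
        result[i] = else_mask
    return result


def run_merged(cc: List[bool]) -> List[int]:
    """Incorrectly 'simplified' program: a single ballot after the branch."""
    all_lanes = list(range(len(cc)))
    mask = ballot(all_lanes)
    return [mask] * len(cc)


cc = [True, False, True, False]
print([bin(m) for m in run_branched(cc)])  # per-lane masks differ by branch side
print([bin(m) for m in run_merged(cc)])    # every lane sees all four lanes
```

In this model, lanes 0 and 2 observe the mask 0b0101 and lanes 1 and 3 observe
0b1010 in the branched program, whereas the merged program yields 0b1111 for
every lane, so the two programs are observably different.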

In the past, we have tried to address this by putting the `convergent`
attribute on functions. However, this approach has two weaknesses.
First, the restrictions imposed by `convergent` are not actually strong
enough for some cases, such as the example above. Second, the definition
of `convergent` relies on the notion of control dependencies, whose
action at a distance makes the restrictions difficult for transforms to
honor. For example, the jump threading pass currently does not honor the
`convergent` attribute correctly in cases such as:

  bool flag = false;
  if (cc1) {
    ...
    if (cc2)
      flag = true;
  }
  if (flag) {
    result = ballot(true);
  }

Since the convergent ballot operation is at a distance from the part
of the code inspected by the jump threading pass, the pass does not see
it and may transform the code in an incorrect way.

This patch proposes to fix these and related problems by putting the
convergent attribute and the underlying notions of divergence and
reconvergence on a solid formal basis. At the same time, the impact
on generic transforms is small by design: a new set of intrinsics is
introduced that can be used to control reconvergence without being
prone to action at a distance. Frontends for GPU-oriented programming
languages are expected to insert these intrinsics, so that passes such
as jump threading will be "correct by default".

In the jump threading example above, a frontend would be expected to
insert intrinsics as follows:

  bool flag = false;
  token tok = @llvm.convergence.anchor();
  if (cc1) {
    ...
    if (cc2)
      flag = true;
  }
  @llvm.convergence.join(tok);
  if (flag) {
    result = ballot(true);
  }

The convergence intrinsics indicate that threads are expected to
reconverge before the second if-statement, which affects the behavior
of the ballot call. The join intrinsic call guards against incorrect
jump threading.

The intention of this RFC is to gauge the interest of the LLVM community
and to find out whether this direction can be accepted going forward.
Frontend and backend parts are required for a complete solution, though
the frontend parts are language-specific and therefore not part of LLVM
itself.

Additional Notes:

- Function inlining needs to add convergence intrinsics when the caller is convergent and the callee contains control flow.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68994

Files:
  llvm/docs/DynamicInstances.rst
  llvm/docs/LangRef.rst
  llvm/docs/Reference.rst
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
  llvm/lib/Transforms/Utils/SimplifyCFG.cpp
  llvm/test/Transforms/JumpThreading/basic.ll
  llvm/test/Transforms/SimplifyCFG/attr-convergent.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68994.225051.patch
Type: text/x-patch
Size: 21817 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20191015/2edbcb1e/attachment.bin>

