[PATCH] D35167: [AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 2 02:15:20 PDT 2017


nhaehnle accepted this revision.
nhaehnle added a comment.
This revision is now accepted and ready to land.

Thanks. Assuming we agree on the derivative calculations logic I added in a comment, this LGTM.



================
Comment at: include/llvm/IR/IntrinsicsAMDGPU.td:744-748
+// Copies the source value to the destination value, with the guarantee that
+// the source value is computed as if the entire program were executed in WQM.
+def int_amdgcn_wqm : Intrinsic<[llvm_any_ty],
+  [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]
+>;
----------------
cwabbott wrote:
> nhaehnle wrote:
> > I believe this should be convergent, due to the way neighboring lanes may disappear due to control flow.
> Hmm, I'm not really convinced. All this intrinsic does is to guarantee something about how its source value is computed, which obviously won't change if the instruction itself is moved around. The operation itself is a simple move operation, which normally isn't convergent. Can you give me an example where adding a control dependency to the WQM intrinsic causes problems?
How about derivative calculations where the result is only used in a subsequent if-block? Basically, we need to prevent
```
%deriv.0 = derivative calculations
%deriv = llvm.amdgcn.wqm(%deriv.0)

if (cc) {
   only_use_of(%deriv)
}
```
being sunk into
```
if (cc) {
  %deriv.0 = derivative calculations
  %deriv = llvm.amdgcn.wqm(%deriv.0)
  only_use_of(%deriv)
}
```
Although, on second thought, I guess all the cross-lane operations involved in computing %deriv.0 are already convergent? So I guess it's fine in the end...


https://reviews.llvm.org/D35167





More information about the llvm-commits mailing list