[PATCH] D35167: [AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 20 08:45:45 PDT 2017


nhaehnle requested changes to this revision.
nhaehnle added a comment.
This revision now requires changes to proceed.

Some minor comments. In addition, I think it can be simplified, and we //probably// want the intrinsic to be convergent, because sinking WQM computations into a non-uniform branch could mean that the computation becomes non-WQM for practical purposes.



================
Comment at: include/llvm/IR/IntrinsicsAMDGPU.td:744-748
+// Copies the source value to the destination value, with the guarantee that
+// the source value is computed as if the entire program were executed in WQM.
+def int_amdgcn_wqm : Intrinsic<[llvm_any_ty],
+  [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]
+>;
----------------
I believe this should be convergent, due to the way neighboring lanes may disappear due to control flow.


================
Comment at: lib/Target/AMDGPU/SIInstructions.td:121
+// after the WQM pass processes them.
+def WQM : PseudoInstSI <(outs unknown:$vdst), (ins unknown:$src0)>;
+
----------------
I believe this should have a let isConvergent = 1, due to the way neighboring lanes could "disappear" with additional control flow.


================
Comment at: lib/Target/AMDGPU/SIWholeQuadMode.cpp:300
+      } else if (Opcode == AMDGPU::WQM) {
+        // the WQM intrinsic requires its output to have all the helper lanes
+        // correct, so we need it to be in WQM.
----------------
Capitalize the comment.


================
Comment at: lib/Target/AMDGPU/SIWholeQuadMode.cpp:676-688
+void SIWholeQuadMode::lowerCopyInstrs() {
+  for (MachineInstr *MI : LowerToCopyInstrs) {
+    const DebugLoc &DL = MI->getDebugLoc();
+    unsigned Dest = MI->getOperand(0).getReg();
+    unsigned Src = MI->getOperand(1).getReg();
+    MachineInstr *Copy =
+        BuildMI(*MI->getParent(), MI, DL, TII->get(AMDGPU::COPY), Dest)
----------------
You can probably use MI->setDesc for this.


https://reviews.llvm.org/D35167





More information about the llvm-commits mailing list