[PATCH] D79481: [AMDGPU] Cluster shader exports

Wed May 6 05:20:39 PDT 2020

foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.

Looks good.

As a follow-on, would this be a good place to try to ensure that position exports go before parameter exports?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp:31-32
+  const MachineInstr *MI = SU.getInstr();
+  return (MI->getOpcode() == AMDGPU::EXP) ||
+         (MI->getOpcode() == AMDGPU::EXP_DONE);
+}
----------------
You don't need these parentheses.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp:39
+  SUnit *ChainHead = Exports[0];
+  for (unsigned Idx = 0, End = Exports.size(); Idx < (End - 1); ++Idx) {
+    SUnit *SUa = Exports[Idx];
----------------
This is trivia, but isn't it simpler to start counting at 1 than to finish at (End - 1) ?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp:45
+        SUnit *PredSU = Pred.getSUnit();
+        // Assumes that nothing will be dependent on an export instruction.
+        if (Pred.isWeak() || isExport(*PredSU))
----------------
I don't quite understand the comment. If nothing depends on an export then surely isExport() on the next line will never be true?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp:58-78
+  // Build chains of exports
+  for (SUnit &SU : DAG->SUnits) {
+    if (!isExport(SU))
+      continue;
+
+    unsigned ChainID = ExportChains.size();
+    for (const SDep &Pred : SU.Preds) {
----------------
Do you understand the chain formation logic, or is it copied from the standard MemOpClusterMutation? (Or both?!)

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.exp.ll:552-558
+define amdgpu_kernel void @test_export_clustering(float %x, float %y) #0 {
+  %z0 = fadd float %x, %y
+  call void @llvm.amdgcn.exp.f32(i32 32, i32 15, float %x, float %y, float %z0, float 0.0, i1 false, i1 false)
+  %z1 = fsub float %y, %x
+  call void @llvm.amdgcn.exp.f32(i32 33, i32 15, float %x, float %y, float %z1, float 1.0, i1 true, i1 false)
+  ret void
+}
----------------
Can you pre-commit this? Did it actually have a waitcnt before this fix?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79481/new/

https://reviews.llvm.org/D79481