[PATCH] D31161: [AMDGPU] New Waitcnt Insertion Pass

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 28 14:39:27 PDT 2017


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:2
+//===-- SIInsertWaitcnts.cpp - Insert Wait Instructions --------------------===/
+
+//
----------------
Extra blank line


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:88-91
+#define ForAllWaitEventType(w)                                                 \
+  for (enum WaitEventType w = (enum WaitEventType)0;                           \
+       (w) < (enum WaitEventType)NUM_WAIT_EVENTS;                              \
+       (w) = (enum WaitEventType)((w) + 1))
----------------
This should be replaced with an iterator range or simply removed


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:510-515
+      } else if (AMDGPU::getAtomicNoRetOp(Inst.getOpcode()) != -1) {
+        setExpScore(
+            &Inst, TII, TRI, MRI,
+            AMDGPU::getNamedOperandIdx(Inst.getOpcode(), AMDGPU::OpName::data),
+            CurrScore);
+      }
----------------
Can't you just get the ::data operand and check if it is a use or def? Same for the other places checking getAtomicNoRetOp


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:760-762
+/**
+ *  @brief Generate s_waitcnt instruction to be placed before cur_Inst.
+ *
----------------
C++ style comments


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:779
+  // s_waitcnt instruction to return; default is NULL.
+  MachineInstr *SWaitInst = NULL;
+  // No need to wait before phi. If a phi-move exists, then the wait should
----------------
nullptr


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:785
+  // instructions that are skipped in the assembling loop.
+  bool NeedLinemapping = false; // TODO: Check on this.
+  if (MI.getOpcode() == AMDGPU::DBG_VALUE &&
----------------
Linemapping->LineMapping


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:786
+  bool NeedLinemapping = false; // TODO: Check on this.
+  if (MI.getOpcode() == AMDGPU::DBG_VALUE &&
+      // TODO: any other opcode?
----------------
MI.isDebugValue()


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:929
+    // instruction.
+    for (const MachineMemOperand *Memop : MI.memoperands()) {
+      unsigned AS = Memop->getAddrSpace();
----------------
I'm concerned by relying on the memoperands since it's possible they were dropped. The uses LDS memory check could at least be factored into a predicate function


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:968-971
+      for (const MachineMemOperand *Memop : MI.memoperands()) {
+        unsigned AS = Memop->getAddrSpace();
+        if (AS != AMDGPUAS::LOCAL_ADDRESS)
+          continue;
----------------
Ditto


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1140-1146
+    bool FoundLDS = false;
+    for (const MachineMemOperand *Memop : Inst.memoperands()) {
+      unsigned AS = Memop->getAddrSpace();
+      if (AS == AMDGPUAS::LOCAL_ADDRESS) {
+        FoundLDS = true;
+        break;
+      }
----------------
This should check the gds operand, not the mem operands. This also needs MIR tests since we don't emit GDS operations now


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1193
+    case AMDGPU::EXP_DONE: {
+      int Imm = Inst.getOperand(0).getImm();
+      if (Imm >= 32 && Imm <= 63)
----------------
getNamedOperand


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1608-1609
+      // This avoids a s_nop after a waitcnt has just been inserted.
+      if (!SWaitInst && InsertNOP)
+        BuildMI(Block, Inst, DebugLoc(), TII->get(AMDGPU::S_NOP)).addImm(0);
+      InsertNOP = false;
----------------
Braces around the block and addImm on next line


================
Comment at: lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1687
+        if (!SWaitInst) {
+          SWaitInst = Block.getParent()->CreateMachineInstr(
+              TII->get(AMDGPU::S_WAITCNT), DebugLoc());
----------------
Use BuildMI rather than the low level instruction creation APIs


https://reviews.llvm.org/D31161





More information about the llvm-commits mailing list