[PATCH] D35967: [AMDGPU] Collapse adjacent SI_END_CF

Thu Jul 27 19:14:57 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:82
+    break;
+  case AMDGPU::S_MOV_B64:
+  case AMDGPU::COPY:
----------------
I would hope we aren't seeing s_mov_b64s with register inputs at this point. Does this actually happen?

================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:114
+
+bool SIOptimizeExecMaskingPreRA::runOnMachineFunction(MachineFunction &MF) {
+  const SISubtarget &ST = MF.getSubtarget<SISubtarget>();
----------------
Missing skipFunction() check at the start of runOnMachineFunction.

================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:129
+    for ( ; I != E; ++I) {
+      if (!TII->isSALU(*I) || I->readsRegister(AMDGPU::EXEC, TRI) ||
+          I->isBranch())
----------------
Do the isBranch check first. I'm not sure why this needs to specifically skip branches, though.
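
A rough, self-contained sketch of the ordering asked for here, assuming the point is that the branch test is a plain flag check while the EXEC query has to walk operands. MockInstr, readsExec() and stopsScan() are hypothetical stand-ins for illustration only, not LLVM API and not the code in the patch:

```cpp
#include <cassert>

// Hypothetical stand-in for a machine instruction; this is not LLVM's
// MachineInstr, and readsExec() only mimics an operand scan.
struct MockInstr {
  bool Branch;
  bool SALU;
  bool ReadsExec;
  mutable int OperandScans;

  MockInstr(bool Branch, bool SALU, bool ReadsExec)
      : Branch(Branch), SALU(SALU), ReadsExec(ReadsExec), OperandScans(0) {}

  bool isBranch() const { return Branch; }
  bool isSALU() const { return SALU; }
  bool readsExec() const {
    ++OperandScans; // count how often the costlier query runs
    return ReadsExec;
  }
};

// True when the forward scan has to stop at I. The flag tests come first,
// so || short-circuits before the operand scan for branches and for
// non-SALU instructions.
bool stopsScan(const MockInstr &I) {
  return I.isBranch() || !I.isSALU() || I.readsExec();
}

// Small usage check: a branch stops the scan without any operand scan.
void demoBranchFirst() {
  MockInstr Br(true, true, true);
  assert(stopsScan(Br));
  assert(Br.OperandScans == 0);
}
```

With this ordering only an SALU non-branch instruction ever reaches the register query.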

================
Comment at: lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp:153-155
+    LIS->removeRegUnit(*MCRegUnitIterator(AMDGPU::EXEC_LO, TRI));
+    LIS->removeRegUnit(*MCRegUnitIterator(AMDGPU::EXEC_HI, TRI));
+
----------------
Since you don't seem to be using LIS for anything, could you move this out of the loop so all of the updates are done at once after you're done modifying the uses?
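
A minimal sketch of the shape being suggested, under the assumption that nothing inside the loop reads the cached EXEC ranges. MockLiveIntervals, collapseAll() and the unit numbers 0 and 1 are hypothetical placeholders, not the real LiveIntervals interface or the real register units of EXEC_LO and EXEC_HI:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the liveness analysis; it only counts how
// many times cached ranges are dropped.
struct MockLiveIntervals {
  int Removals = 0;
  void removeRegUnit(unsigned /*Unit*/) { ++Removals; }
};

// Rewrite every candidate first, then invalidate the cached ranges once
// if anything changed. Even values stand in for instructions that get
// collapsed; odd values are left alone.
bool collapseAll(std::vector<int> &Candidates, MockLiveIntervals &LIS) {
  bool Changed = false;
  for (int &C : Candidates) {
    if (C % 2 == 0) {
      C = -1; // mark as rewritten
      Changed = true;
    }
  }
  if (Changed) {
    LIS.removeRegUnit(0); // placeholder for the EXEC_LO unit
    LIS.removeRegUnit(1); // placeholder for the EXEC_HI unit
  }
  return Changed;
}
```

However many candidates are rewritten, the invalidation then runs at most once per function instead of once per rewrite.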

https://reviews.llvm.org/D35967