[PATCH] D17454: Insert nop for each high level source statement

Fri Feb 19 11:10:29 PST 2016

kzhuravl added a comment.

tools want to selectively substitute S_NOPs with S_TRAPs based on the user input

================
Comment at: lib/Target/AMDGPU/AMDGPUInsertNopsPass.cpp:1
@@ +1,2 @@
+//===-- AMDGPUAlwaysInlinePass.cpp - Promote Allocas ----------------------===//
+//
----------------
tstellarAMD wrote:
> The file name in the header is wrong, but I think the file and the class names should be renamed to use the 'SI' prefix rather than the 'AMDGPU' prefix to be consistent with other GCN only passes.
ok

================
Comment at: lib/Target/AMDGPU/AMDGPUInsertNopsPass.cpp:11-19
@@ +10,11 @@
+/// \file
+/// These passes insert S_NOP instruction for each high level source statement.
+/// AMDGPUInsertDebugNops pass inserts DEBUG_NOP pseudo instructions before
+/// register allocation. AMDGPULowerDebugNops pass lowers DEBUG_NOP instructions
+/// to S_NOP instructions before machine code is emitted.
+///
+/// S_NOP for each high level source statement is needed for tools (i.e.
+/// debugger, profiler), which overwrite S_NOPs with S_TRAPs as they see fit.
+//
+//===----------------------------------------------------------------------===//
+#include "AMDGPU.h"
----------------
tstellarAMD wrote:
> Why do we need two passes.  Can't we just insert the S_NOP instructions in the first pass?
this should work with different optimization levels. for o0 one pass works fine. in other opt levels instructions are reordered at different compilation stages. first pass inserts DEBUG_NOP pseudo instructions before register allocation. DEBUG_NOP pseudo instruction has isTerminator attribute, which makes reordering across DEBUG_NOPs not possible. second pass lowers DEBUG_NOPs to S_NOPs right before machine code is emitted.

================
Comment at: lib/Target/AMDGPU/AMDGPUInsertNopsPass.cpp:122
@@ +121,3 @@
+  bool modified = false;
+  for (auto &CMB : MF) {
+    auto CMI = CMB.begin();
----------------
tstellarAMD wrote:
> Style comment.  We usually use MBB as a variable name when iterating over blocks and MI when iterating over Machine Instructions.  This will make it more obvious what the code is doing.
ok

================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:272
@@ +271,3 @@
+void AMDGPUPassConfig::addPreRegAlloc() {
+  if (InsertNops) {
+    addPass(createAMDGPUInsertDebugNopsPass());
----------------
tstellarAMD wrote:
> We should also be running this pass if the user specifies -g.
tools team specifically asked not to do this. for example, for profiling they want to run it with -g, but without inserting nops

================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:273
@@ +272,3 @@
+  if (InsertNops) {
+    addPass(createAMDGPUInsertDebugNopsPass());
+  }
----------------
tstellarAMD wrote:
> This should be added to GCNPassConfig since it is GCN only.
ok

http://reviews.llvm.org/D17454