[PATCH] D17454: Insert nop for each high level source statement

Tom Stellard via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 19 13:08:24 PST 2016


tstellarAMD added inline comments.

================
Comment at: lib/Target/AMDGPU/AMDGPUInsertNopsPass.cpp:11-19
@@ +10,11 @@
+/// \file
+/// These passes insert S_NOP instruction for each high level source statement.
+/// AMDGPUInsertDebugNops pass inserts DEBUG_NOP pseudo instructions before
+/// register allocation. AMDGPULowerDebugNops pass lowers DEBUG_NOP instructions
+/// to S_NOP instructions before machine code is emitted.
+///
+/// S_NOP for each high level source statement is needed for tools (i.e.
+/// debugger, profiler), which overwrite S_NOPs with S_TRAPs as they see fit.
+//
+//===----------------------------------------------------------------------===//
+#include "AMDGPU.h"
----------------
kzhuravl wrote:
> tstellarAMD wrote:
> > Why do we need two passes.  Can't we just insert the S_NOP instructions in the first pass?
> this should work with different optimization levels. for o0 one pass works fine. in other opt levels instructions are reordered at different compilation stages. first pass inserts DEBUG_NOP pseudo instructions before register allocation. DEBUG_NOP pseudo instruction has isTerminator attribute, which makes reordering across DEBUG_NOPs not possible. second pass lowers DEBUG_NOPs to S_NOPs right before machine code is emitted.
By the time we get to running the first pass, the code will have already been re-ordered by the LLVM IR passes as well as the SelectionDAG.  We also can't insert instructions with terminators in the middle of blocks, because this will break other passes (and the verifier).

Can we start with one pass and if the result isn't good enough then maybe look for other solutions? 

================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:272
@@ +271,3 @@
+void AMDGPUPassConfig::addPreRegAlloc() {
+  if (InsertNops) {
+    addPass(createAMDGPUInsertDebugNopsPass());
----------------
kzhuravl wrote:
> tstellarAMD wrote:
> > We should also be running this pass if the user specifies -g.
> tools team specifically asked not to do this. for example, for profiling they want to run it with -g, but without inserting nops
Ok, so the nops are required for both debugging and profiling?  Is the command line flag you added accessible from clang?


http://reviews.llvm.org/D17454





More information about the llvm-commits mailing list