[PATCH] D17454: Insert two S_NOP instructions for every high level source statement.

Konstantin Zhuravlyov via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 21 09:34:47 PST 2016


kzhuravl marked an inline comment as done.
kzhuravl added a comment.

Tom's feedback


================
Comment at: lib/Target/AMDGPU/AMDGPUInsertNopsPass.cpp:11-19
@@ +10,11 @@
+/// \file
+/// These passes insert S_NOP instruction for each high level source statement.
+/// AMDGPUInsertDebugNops pass inserts DEBUG_NOP pseudo instructions before
+/// register allocation. AMDGPULowerDebugNops pass lowers DEBUG_NOP instructions
+/// to S_NOP instructions before machine code is emitted.
+///
+/// S_NOP for each high level source statement is needed for tools (i.e.
+/// debugger, profiler), which overwrite S_NOPs with S_TRAPs as they see fit.
+//
+//===----------------------------------------------------------------------===//
+#include "AMDGPU.h"
----------------
tstellarAMD wrote:
> kzhuravl wrote:
> > tstellarAMD wrote:
> > > Why do we need two passes?  Can't we just insert the S_NOP instructions in the first pass?
> > This should work at different optimization levels. At O0, one pass works fine. At other optimization levels, instructions are reordered at different compilation stages. The first pass inserts DEBUG_NOP pseudo instructions before register allocation. The DEBUG_NOP pseudo instruction has the isTerminator attribute, which makes reordering across DEBUG_NOPs impossible. The second pass lowers DEBUG_NOPs to S_NOPs right before machine code is emitted.
> By the time we get to running the first pass, the code will have already been re-ordered by the LLVM IR passes as well as the SelectionDAG.  We also can't insert instructions with terminators in the middle of blocks, because this will break other passes (and the verifier).
> 
> Can we start with one pass and if the result isn't good enough then maybe look for other solutions? 
After discussion with Tools, it was decided to insert two S_NOPs for each high-level source statement; this way we do not have to disable any optimizations at non-O0 optimization levels. One S_NOP is inserted before the first ISA instruction of each high-level source statement, and one after its last ISA instruction. I have updated the diff, which now includes a single pass.
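
To illustrate the placement described above, here is a minimal standalone sketch. It is not the code from the diff and does not use any LLVM APIs: `Inst`, `insertStatementNops`, and the convention that line 0 means "no source location" are hypothetical names and assumptions made for this toy model only. It treats the first and last instruction carrying a given source line as the statement's boundaries, even when instructions from other statements have been scheduled in between.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct Inst {
  std::string Opcode;
  unsigned Line; // Source line; 0 is assumed to mean "no debug location".
};

// Returns a copy of Code with one S_NOP before the first instruction and one
// S_NOP after the last instruction of every source line that appears in Code.
std::vector<Inst> insertStatementNops(const std::vector<Inst> &Code) {
  // Source line -> (index of first instruction, index of last instruction).
  std::map<unsigned, std::pair<std::size_t, std::size_t>> Range;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    if (Code[I].Line == 0)
      continue; // Instructions without a source line get no S_NOPs.
    auto It = Range.find(Code[I].Line);
    if (It == Range.end())
      Range.emplace(Code[I].Line, std::make_pair(I, I));
    else
      It->second.second = I;
  }

  std::vector<Inst> Out;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    auto It = Range.find(Code[I].Line);
    bool Tracked = It != Range.end();
    if (Tracked && It->second.first == I)
      Out.push_back({"S_NOP", Code[I].Line}); // Before the first instruction.
    Out.push_back(Code[I]);
    if (Tracked && It->second.second == I)
      Out.push_back({"S_NOP", Code[I].Line}); // After the last instruction.
  }
  return Out;
}
```

Because the existing instructions keep their order and only S_NOPs are added, nothing in this model constrains scheduling, which matches the stated goal of not disabling optimizations.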


http://reviews.llvm.org/D17454




