[LLVMdev] Proposal: Debug information improvement - keep the line number with optimizations

Mon Feb 2 23:50:19 PST 2009

Hi Patel,

Here is the second part of my reply.
> 2.      Proposed Work Plan
> > This section defines a proposed work plan to accomplish the  
> > requirements that we desire. The work plan is broken into several  
> > distinct phases that follow a logical progression of modifications  
> > to the LLVM software.
> >
> > 2.1   Phase 1: Establish the testing system
> > One of the most useful things to get started is to have some way to  
> > determine whether codegen is being impacted by debug info.  It is  
> > important to be able to tell when this happens so that we can track  
> > down these places and fix them.
> >
> > 2.1.1    Pass Scanning Script
> > Following the approach Chris proposed, it would be good to have a  
> > script that scans the standard LLVM transform pass list. We can get  
> > the standard compile optimization pass list with:
>   
>
> You can use http://llvm.org/docs/SourceLevelDebugging.html#debugopt as  
> a starting point here.
>   

Ok.
>> >
>> >        $ opt -std-compile-opts -debug-pass=Arguments foo.bc > /dev/null
...
...
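
A rough sketch of how the scanning script could collect the pass list from that command. It assumes the list is printed on lines beginning with "Pass Arguments:" (presumably on stderr, since the command above discards stdout); the function name parse_pass_arguments is my own and not part of LLVM.

```python
import shlex


def parse_pass_arguments(debug_output):
    """Collect pass flags from every line that starts with 'Pass Arguments:'.

    Assumption: this prefix is how -debug-pass=Arguments labels its output.
    """
    prefix = "Pass Arguments:"
    passes = []
    for line in debug_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Everything after the prefix is a whitespace-separated flag list.
            passes.extend(shlex.split(line[len(prefix):]))
    return passes


# Illustrative captured text, not real output from a run.
sample = (
    "Pass Arguments:  -simplifycfg -instcombine\n"
    "Pass Arguments:  -loop-unroll\n"
)
for flag in parse_pass_arguments(sample):
    print(flag)
```

The script could then run opt once per collected flag, with and without debug info, and compare the results.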
> 2.2   Phase 2: New Pass to Strip Debug Information
> > LLVM already has a transform pass "-strip-debug", which removes all  
> > debug information. For the first half of this project, however, we  
> > want to keep just the line number information (stop points) in the  
> > optimized code, so we need a new transform pass that removes only  
> > the variable declaration information. Pass "-strip-debug" also does  
> > not clean up the dead variables and function calls left behind by  
> > the debug information; it assumes other passes such as "-dce" or  
> > "-globaldce" will handle this. But as we are also going to update  
> > those passes, we can't use them in the verification flow; otherwise  
> > it may output incorrect check results.
> >
> > The new pass "-strip-debug-pro" should support the following modes:
> > 1.         Remove only the variable declaration information and  
> > clean up the dead debug information.
> >
> > 2.         Remove all the debug information and clean up.
> >
> > 2.2.1    Work Plan
> > 1.         Use the transform pass StripSymbols.cpp as a reference.
> > 2.         Based on StripSymbols.cpp, add an option to it that  
> > removes only debug information, like "-rm-debug"
>   
>
> That's what -strip-debug is doing.
>
>   
>> > 3.         Add an option to just remove the variable declaration  
>> > information, like "-rm-debug=2"
>>     
>
> Why not -strip-debug=2 if you want a way to remove variable  
> declarations ..?
>   
Agree.
>   
>> > 4.         Add a procedure to clean up the dead variables and  
>> > function calls that exist only for debug purposes.
>> >
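
To make the two stripping modes concrete, here is a small sketch that works on textual IR lines rather than on the real in-memory IR. The intrinsic names (llvm.dbg.declare for variable declarations, llvm.dbg.stoppoint for line numbers) are how I understand the current debug intrinsics; the function strip_debug, its only_declarations parameter, and the simplified IR lines are made up for illustration. It does not model step 4 (cleaning up values that become dead).

```python
# Assumed intrinsic names; see the note above.
DECLARE_INTRINSIC = "@llvm.dbg.declare"
ANY_DEBUG_INTRINSIC = "@llvm.dbg."


def strip_debug(ir_lines, only_declarations=False):
    """Drop calls to debug intrinsics from a list of textual IR lines.

    only_declarations=False: remove every debug intrinsic call.
    only_declarations=True:  remove variable declarations, keep stop points.
    """
    needle = DECLARE_INTRINSIC if only_declarations else ANY_DEBUG_INTRINSIC
    return [
        line for line in ir_lines
        if not ("call" in line and needle in line)
    ]


# Simplified, hand-written IR lines (not compiler output).
ir = [
    "  call void @llvm.dbg.stoppoint(i32 3, i32 0, { }* %unit)",
    "  call void @llvm.dbg.declare({ }* %x.addr, { }* %x.desc)",
    "  %sum = add i32 %a, %b",
]
for line in strip_debug(ir, only_declarations=True):
    print(line)
```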
>> > 2.3   Phase 3: Extend llvm-gcc
>> > Once we have a way to verify what is happening, I propose that we  
>> > aim for an intermediate point: instead of having -O disable all  
>> > debug info, we should make it disable just variable information, but  
>> > keep emitting line number info.  This would allow stepping through  
>> > the program, getting stack traces, using performance tools like  
>> > Shark, etc.
>> >
>> > We need the front-end llvm-gcc to have a mode that causes it to  
>> > emit line number info but not variable info. With that mode, we  
>> > can go through the process above to identify passes that change  
>> > behavior when line number intrinsics are in the code.
>> >
>> > 2.3.1    Work Plan
>> > 1.         First locate where llvm-gcc handles its command-line  
>> > options.
>> > 2.         Add a new option that tells llvm-gcc which debug  
>> > information to emit, like -g1, where -g1 emits only line numbers
>>     
>
>   
>> > 3.         Build the new llvm-gcc
>> > 4.         Test with llvm/test and llvm-test
>> >
>> > 2.4   Phase 4: Update Transform Passes for Line Number Info.
>> > When the front-end has a mode that causes it to emit line number  
>> > info but not variable info, we can go through the process above to  
>> > identify passes that change behavior when line number intrinsics are  
>> > in the code.
>>     
>
> I think the optimizer does not change behavior when dbg info is  
> present. Try running dbgopt tests.
>
>   
>> >   Obvious cases are things like loop unroll and inlining: they  
>> > 'measure' the size of some code to determine whether to transform  
>> > it or not. This means that they should be enhanced to ignore debug  
>> > intrinsics for the sake of code size estimation.
>>     
>
> The loop unrolling pass already ignores the debug info! See  
> LoopUnroll.cpp::ApproximateLoopSize()
>   
ok, I see.
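
Just to check my understanding of the idea, a toy size estimate that skips debug intrinsic calls might look like the following. This is not the code in LoopUnroll.cpp; it counts textual IR lines, and the helper names and sample lines are invented for the example.

```python
def is_debug_intrinsic_call(line):
    """True for a textual IR line that calls an llvm.dbg.* intrinsic."""
    return "call" in line and "@llvm.dbg." in line


def approximate_size(ir_lines):
    """Count instructions, ignoring blank lines and debug intrinsic calls."""
    return sum(
        1 for line in ir_lines
        if line.strip() and not is_debug_intrinsic_call(line)
    )


# Simplified, hand-written loop body (not compiler output).
loop_body = [
    "  call void @llvm.dbg.stoppoint(i32 5, i32 0, { }* %unit)",
    "  %i.next = add i32 %i, 1",
    "  call void @llvm.dbg.stoppoint(i32 6, i32 0, { }* %unit)",
    "  %done = icmp eq i32 %i.next, %n",
]
print(approximate_size(loop_body))
```

With such an estimate the unroll decision is the same whether or not stop points are present.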
>   
>> >
>> > Another example is optimizations like SimplifyCFG when it merges  
>> > if/then/else into select instructions. SimplifyCFG will have to be  
>> > enhanced to ignore debug intrinsics when doing its  
>> > safety/profitability analysis,
>>     
>
> I think, it handles this part well, but ...
>   
>> > but then it will also have to be updated to just delete the line  
>> > number intrinsics when it does the xform. This is simplifycfg's way  
>> > of "updating" the debug info for this example transformation.
>>     
>
> .. the second part has not received full attention.
>
>   
>> > As we progress through various optimizations, we will find cases  
>> > where it is possible to update (e.g. loop unroll or inlining, which  
>> > doesn't have to do anything special to update line #'s) and places  
>> > where it isn't.  As long as the debug intrinsics don't affect  
>> > codegen, we are happy, even if the debug intrinsics are deleted in  
>> > cases where it would be possible to update them (this becomes an  
>> > optimized debugging QoI issue).
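
The check "debug intrinsics don't affect codegen" can be sketched as a comparison: run a pass with debug info and strip afterwards, then strip first and run the pass, and see whether the two results match. Everything below is a toy model on textual IR lines; the two "passes" are hypothetical stand-ins for an if/then/else-to-select fold, not SimplifyCFG itself.

```python
def is_debug_call(line):
    return "call" in line and "@llvm.dbg." in line


def strip_all(ir_lines):
    """Remove every debug intrinsic call."""
    return [line for line in ir_lines if not is_debug_call(line)]


def debug_info_changes_codegen(run_pass, ir_with_debug):
    """True if the pass produces different code when debug info is present."""
    optimized_then_stripped = strip_all(run_pass(ir_with_debug))
    stripped_then_optimized = run_pass(strip_all(ir_with_debug))
    return optimized_then_stripped != stripped_then_optimized


SELECT = "  %r = select i1 %cond, i32 %a, i32 %b"


def naive_fold(block):
    # Folds only when the block holds exactly one line, debug calls included.
    return [SELECT] if len(block) == 1 else block


def careful_fold(block):
    # Ignores debug calls in the analysis and deletes them when it folds.
    return [SELECT] if len(strip_all(block)) == 1 else block


# Simplified, hand-written block (not compiler output).
block = [
    "  call void @llvm.dbg.stoppoint(i32 7, i32 0, { }* %unit)",
    "  %t = add i32 %a, 1",
]
print(debug_info_changes_codegen(naive_fold, block))    # True
print(debug_info_changes_codegen(careful_fold, block))  # False
```

The careful variant loses the stop point when it folds, which is acceptable under the rule above: the generated code is identical either way.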