[cfe-dev] RFC: Remove uninteresting debug locations at -O0

Thu Apr 30 20:30:19 PDT 2020

> On Apr 29, 2020, at 2:52 PM, Reid Kleckner <rnk at google.com> wrote:
> 
> Sure, I can try to elaborate, but I am assuming a bit about what the user's desired stepping behavior is. I am assuming in this case that the user is doing the equivalent of `s` or `n` in gdb, and they want to get from one semicolon to the next. If that is not the case, if the user wants the debugger to stop at the location of both getFoo() calls in your example, my suggestion isn't helpful.

So instead of moving forward on a by-line basis, "n" would move forward to the next source-level statement?

Is the motivation to make larger steps, such as in:

  f(var1,
    var2,
    var3);
  g(var4);

having "n" jump from the f-call directly to the g-call? Or are you concerned about code that has more than one statement per line? I guess the latter is very rare, in LLVM code at least.
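As a rough illustration of the difference between the two stepping styles, here is a toy model of a line table for the snippet above. It is only a sketch under stated assumptions: the `Row` struct, the `multiLineCall` table, and the `stops` helper are hypothetical names, the table assumes that at -O0 each argument load carries the location of its own source line, and the stepping logic is far simpler than what gdb or LLDB actually do.

```cpp
#include <cstddef>
#include <vector>

// Simplified line-table row (hypothetical model): the source line an
// instruction maps to, and whether it is flagged as a statement start.
struct Row {
  unsigned line;
  bool isStmt;
};

// Assumed -O0 instruction order for:
//   1: f(var1,
//   2:   var2,
//   3:   var3);
//   4: g(var4);
std::vector<Row> multiLineCall() {
  return {
      {1, true},  // load var1
      {2, false}, // load var2
      {3, false}, // load var3
      {1, false}, // call f
      {4, true},  // load var4
      {4, false}, // call g
  };
}

// Source lines visited by repeatedly stepping through the whole table.
// With byStatement == false, a stop happens whenever the line changes.
// With byStatement == true, only rows flagged isStmt are stops.
std::vector<unsigned> stops(const std::vector<Row> &table, bool byStatement) {
  std::vector<unsigned> visited;
  if (table.empty())
    return visited;
  visited.push_back(table.front().line); // where stepping starts
  for (std::size_t i = 1; i < table.size(); ++i) {
    bool stop = byStatement ? table[i].isStmt
                            : table[i].line != table[i - 1].line;
    if (stop)
      visited.push_back(table[i].line);
  }
  return visited;
}
```

In this model, `stops(multiLineCall(), false)` visits lines 1, 2, 3, 1, 4, while `stops(multiLineCall(), true)` visits only lines 1 and 4, which is the "jump from the f-call directly to the g-call" behavior.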

> However, since the last dev meeting, I have been thinking about having IR intrinsics that mark the IR position where a statement begins. The intrinsic would carry a source location. The next instruction emitted with that source location would carry the DWARF is_stmt line table flag.

What I'm personally most interested in is making smaller steps, such as expression-based stepping, i.e.:

f(a(), b(), c())

being able to "n" from the a-call to the b-call. For that, the current line info would probably be good enough, because we already emit different column locations for all subexpressions.
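A minimal sketch of how column information could drive such stepping, assuming a simplified line table with one row per call. The `LineRow` struct, the `StepMode` enum, `nextStop`, and the addresses in `exampleTable` are all hypothetical and only loosely follow the DWARF line-table columns (address, line, column, is_stmt). This is not how any existing debugger implements "n".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One row of a simplified line table (hypothetical model).
struct LineRow {
  std::uint64_t address;
  unsigned line;
  unsigned column;
  bool isStmt;
};

enum class StepMode { Line, Statement, Expression };

// Index of the row where a step from `current` would stop, or table.size()
// if execution runs off the end of the table.
std::size_t nextStop(const std::vector<LineRow> &table, std::size_t current,
                     StepMode mode) {
  assert(current < table.size());
  const LineRow &from = table[current];
  for (std::size_t i = current + 1; i < table.size(); ++i) {
    const LineRow &row = table[i];
    switch (mode) {
    case StepMode::Line: // stop when the source line changes
      if (row.line != from.line)
        return i;
      break;
    case StepMode::Statement: // stop only at rows flagged is_stmt
      if (row.isStmt)
        return i;
      break;
    case StepMode::Expression: // stop when line or column changes
      if (row.line != from.line || row.column != from.column)
        return i;
      break;
    }
  }
  return table.size();
}

// Assumed rows for:
//   1: f(a(), b(), c());
//   2: g();
std::vector<LineRow> exampleTable() {
  return {
      {0x00, 1, 3, true},   // call a()
      {0x08, 1, 8, false},  // call b()
      {0x10, 1, 13, false}, // call c()
      {0x18, 1, 1, false},  // call f()
      {0x20, 2, 1, true},   // call g()
  };
}
```

Starting at the a-call (row 0), `StepMode::Expression` stops at the b-call (row 1), whereas `StepMode::Line` and `StepMode::Statement` both skip ahead to the g-call (row 4).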

> Making this idea work with optimizations is harder, since the instructions that make up a statement may all be removed or hoisted. To make this work, we would have to establish which instructions properly belong to the statement. This could be done by adding a level to the DIScope hierarchy. The statement markers would remain in place, and code motion would happen around them. Late in the codegen pipeline, the first instruction belonging to the most recently activated statement will be emitted with the is_stmt flag.

The main problem I see with optimized code in this respect is when instructions get merged, hoisted, and otherwise moved or reused. I'd be curious to see some examples of how that approach would handle loop-invariant code motion and DAGCombine patterns.
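To make the loop-invariant code motion concern concrete, here is a hand-written sketch of the transformation. It is not actual compiler output: the function names and the line numbers in the comments are illustrative assumptions. After hoisting, the multiply still originates from the statement in the loop body, but it executes before the loop is entered, so it is unclear which statement marker it should be attributed to.

```cpp
#include <vector>

// As written in the source: `scale * offset` is part of the statement in
// the loop body (line numbers in the comments are illustrative).
int sumScaled(const std::vector<int> &values, int scale, int offset) {
  int total = 0;
  for (int v : values) {         // line 3
    total += v + scale * offset; // line 4
  }
  return total;                  // line 6
}

// Hand-written equivalent of what hoisting would produce: the multiply now
// runs once, before the loop, even when `values` is empty and the statement
// on line 4 never executes.
int sumScaledHoisted(const std::vector<int> &values, int scale, int offset) {
  int total = 0;
  const int invariant = scale * offset; // originally part of line 4
  for (int v : values) {
    total += v + invariant;
  }
  return total;
}
```

Both functions compute the same result; the difference is only in where the multiply sits relative to the loop, which is what a statement-marker scheme would have to account for.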

-- adrian

> 
> On Wed, Apr 29, 2020 at 9:36 AM Adrian Prantl <aprantl at apple.com> wrote:
> Thanks to all of you for sharing your perspective! You all brought up important aspects that I hadn't considered. The argument that really clicked for me was Pavel's that you might want to change the value before the load happens.
> 
> One of my worries here is that what I called "less interesting" locations might delete more interesting ones when instructions and their locations are merged. But if we indeed consider the stack slot loads to be interesting, then that is really the best we can do. After all, storing more than one source location per instruction would be a UI design nightmare for a debugger.
> 
> 
> Reid, I would like to learn more about what you mean by statement tracking. I'm thinking of statements as the expressions separated by semicolons, but since my whole example was a single statement, I'm assuming you have something more fine-grained in mind. Perhaps you can post an example to illustrate?
> 
> 
> thanks!
> -- adrian
