<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi Dean,<div class=""><br class=""></div><div class="">Thank you for the detailed post - I agree that this is the right approach.  This is what we converged on in the MLIR IR as well for basically the same reasons you listed.</div><div class=""><br class=""></div><div class="">I think it would be great to consider porting the MLIR design over the LLVM IR.</div><div class=""><br class=""></div><div class="">-Chris<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On May 22, 2020, at 4:13 PM, Dean Sutherland via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">If you intend to manage source locations for an optimizing compiler, you need a much richer data structure than Clang & LLVM currently support. The compiler company I worked for went down this road ~25 years ago. It's actually quite straightforward to build, albeit potentially expensive in terms of memory and compilation time. But the result is that you can support accurate debugging of fully-optimized code. The model we used didn't even try to represent any source location smaller than a line of code, because we had waaay less memory available back then. The approach we used can reasonably apply to any granularity of source location you choose to implement. That said:<div class=""><ul class=""><li class="">Any given instruction may "originate from" an arbitrary number of distinct source locations (call them LOCs). Consider inline expansion of a method, or macro expansion as two obvious examples; there are many more. The resulting instructions "originate from" the location of the call (or macro) <i class="">and also</i> from the particular line(s) in the method (macro) that generated them. So we need to be able to map one instruction/program-address to many LOCs.</li><li class="">Any given line of code may be implemented by a bunch of instructions that may or may not be contiguous. So we need to be able to map a single LOC to multiple instructions, ranges of instructions, multiple non-contiguous ranges of instructions, etc.</li><li class="">Depending on later optimization, there could legitimately be multiple distinct instructions <i class="">each of which</i> is "a beginning" of that LOC. More multiplicity.</li><li class="">Depending on later optimization, there could be multiple distinct <i class="">instances</i> of a single LOC. Consider a method that is inline-expanded at K call sites.  Each LOC within that method has (at least) K instances. Each maps back to the same LOC, but the instances aren't the same as each other. This is an important and possibly non-obvious part of the design. Loop unrolling is another example; hoisting an invariant out of a loop is yet another.</li></ul><div class="">We found by experience that attempting to maintain beginnings (and endings) of LOC instances through the entire compiler was too hard; we always messed it up. So what we did instead was to tag each syntax tree (or Optimizer IR node, or instruction, etc.) with <i class="">all</i> the LOC instances that it's part of.  We did some quite simple LOC management (see below) throughout IR generation, optimization, and code generation, then computed the beginnings, ends, and ranges of LOC instances at the end of the compilation process via the obvious flow-based algorithms.</div><div class=""><br class=""></div><div class="">Tagging everything with all its LOC info and recovering beginnings/ends after all transformations are complete makes managing source location data inside the compiler essentially trivial. It devolves into five simple cases:<br class=""><ol class=""><li class="">When you delete code/IR/instructions (I'll call all of these <i class="">code</i> from now on), delete the LOC data along with them. With a reasonable implementation this comes for free when deleting the code.</li><li class="">When moving code, move the LOC data along with it/them. Also a freebie.</li><li class="">When combining code, union the LOC info. This sometimes happens in combination with the next case.</li><li class="">When creating new code (simple case) copy the LOC info from the origin from which it was created onto the new code.</li><li class="">When replicating code (more complex case) — inline expansion, loop unrolling, macro expansion... — create a new <i class="">instance</i> of each LOC for each replication of the origin. New instances are treated as distinct lines of code for the purpose of computing begin/end locations. When setting a breakpoint at the beginning of line N, for example, you would get breakpoint(s) at each beginning of each distinct instance of line N. For fancier debug support, one could set breakpoints at particular instances of line N without stopping at other instances, and so on.</li></ol></div></div><div class="">The pain of implementing such an approach in the compiler is that somebody has to visit more or less the entire code base to ensure that each thing that transforms code applies the correct choice from the above 5 cases. The good news is that each such decision is both local and essentially trivial — we finished the first cut at the update in our compilers' code (~400KLOC at the time) in about two person-weeks of effort. Followed, of course, by testing & debugging time to smoke out the incorrect or missing decisions. </div><div class=""><br class=""></div><div class="">The result of this approach is that you get quality source location information for minimum-to-low optimization levels at reasonable expense in memory & compile-time and with only modest expansion of the size of the debug info. At higher optimization levels you continue to get fully-accurate source location information (modulo bugs, as with all things). Depending on how agressive your optimizations are, that location information may be anything from somewhat non-obvious to utterly brain-bending. The increase in size for the debug info can be large, but is roughly proportional to the degree to which code has been transformed.</div><div class=""><br class=""></div><div class="">You can propagate LOCs through essentially arbitrary optimizations: loop unrolling, vectorization, loop pipelining, cache blocking, or whatever. You can even track source locations through software loop pipelined, unrolled & optimized VLIW code, if that happens to be the problem at hand. As an added benefit, it supports things like providing auditors looking at safety-critical code with the exact origin(s) of every instruction in the program. It even helps compiler implementers understand what their optimizer has done to the program.</div><div class=""><br class=""></div><div class="">Obviously, your debugger has to understand the more complicated model of source locations. For example, asking for a breakpoint at the beginning of a source line may require setting either one or many breakpoints in the program.  Similarly, if you stop at an arbitrary instruction and ask the debugger where you are in the source, you could wind up with something like "Stopped at the beginning of line-K-instance-X, somewhere during execution of line-L-instance-1, and just after the end of line-A". The most useful breakpoint operations were: "stop at the beginning of (each instance of) a line" and "stop just after the last instruction of (each instance of) a line", because those are the points where either nothing for that line has yet executed or where everything for that line is complete. Obviously, arbitrary other stuff could be in flight there as well, including prior or subsequent lines, or whatever.</div><div class=""><br class=""></div><div class="">We also implemented a way to accurately explain to the debugger how to find the value of variables post optimization, but that'd need another big wall of text so I'll skip it for now.</div><div class=""><br class=""></div><div class="">Sorry to go on so long. My primary message is that propagating rich source location information through a compiler is a problem that is conceptually easy, is tedious but feasible to implement, and that has known proven solutions. Further details on request, if anyone is interested.</div><div class=""><br class=""></div><div class="">Dean</div><div class=""><br class=""></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 29, 2020 at 2:59 PM David Blaikie via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div dir="ltr" class=""><br class=""></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 29, 2020 at 2:52 PM Reid Kleckner <<a href="mailto:rnk@google.com" target="_blank" class="">rnk@google.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr" class="">Sure, I can try to elaborate, but I am assuming a bit about what the users desired stepping behavior is. I am assuming in this case that the user is doing the equivalent of `s` or `n` in gdb, and they want to get from one semicolon to the next. If that is not the case, if the user wants the debugger to stop at the location of both getFoo() calls in your example, my suggestion isn't helpful.<div class=""><br class=""></div><div class="">However, since the last dev meeting, I have been thinking about having IR intrinsics that mark the IR position where a statement begins. The intrinsic would carry a source location. The next instruction emitted with that source location would carry the DWARF is_stmt line table flag.</div><div class=""><br class=""></div><div class="">Making this idea work with optimizations is harder, since the instructions that make up a statement may all be removed or hoisted. To make this work, we would have to establish which instructions properly belong to the statement. This could be done by adding a level to the DIScope hierarchy. The statement markers would remain in place, and code motion would happen around them.</div></div></blockquote><div class=""><br class="">In this case there would be no intrinsic, just a new kind of "statement" DIScope? yeah, that's the best idea I've had/heard of/discussed for statement grouping so far. Certainly seems imminently prototype-able & then get a sense for just how expensive scoping everything like that would be.<br class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""> Late in the codegen pipeline, the first instruction belonging to the most recently activated statement will be emitted with the is_stmt flag.</div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 29, 2020 at 9:36 AM Adrian Prantl <<a href="mailto:aprantl@apple.com" target="_blank" class="">aprantl@apple.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thanks to all of you for sharing your perspective! You all brought up important aspects that I hadn't considered. The argument that really clicked for me was Pavel's that you might want to change the value before the load happens.<br class="">

<br class="">

One of my worries here is that what I called "less interesting" locations might delete more interesting ones when instructions and their locations are merged. But if we indeed consider the stack slot loads to be interesting then that is really the best we can do. After all, storing more than source location per instruction would be a UI design nightmare for a debugger.<br class=""></blockquote></div></blockquote><div class=""><br class="">Actually I'd kind of love this - I think it'd be really great to describe source ranges instead of source locations - probably super size costly though (& intermediate compiler memory usage, etc). Being able to describe nesting, etc - the same way Clang does with expression/subexpression highlighting I think would be wonderful - imagine if every instruction weren't just attributed to one location, but to a source range with a preferred location (instead of just pointing to the "+" for an add, be able to highlight the LHS and RHS of that plus expression, etc).<br class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br class="">

<br class="">

Reid, I would like to learn more about what you mean by statement tracking. I'm thinking of a statements as the expressions separated by semicolons, but since my whole example was a single statement I'm assuming you have something more fine-grained in mind. Perhaps you can post an example to illustrate what you have in mind?<br class="">

<br class="">

<br class="">

thanks!<br class="">

-- adrian</blockquote></div>

</blockquote></div></div>

_______________________________________________<br class="">

cfe-dev mailing list<br class="">

<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a><br class="">

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank" class="">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br class="">

</blockquote></div>

_______________________________________________<br class="">cfe-dev mailing list<br class=""><a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><br class="">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<br class=""></div></blockquote></div><br class=""></div></body></html>