<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Feb 27, 2019, at 3:14 PM, Zachary Turner &lt;<a href="mailto:zturner@google.com" class="">zturner@google.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 27, 2019 at 2:52 PM Frédéric Riss &lt;<a href="mailto:friss@apple.com" class="">friss@apple.com</a>&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><blockquote type="cite" class=""><div class="">On Feb 27, 2019, at 10:12 AM, Zachary Turner &lt;<a href="mailto:zturner@google.com" target="_blank" class="">zturner@google.com</a>&gt; wrote:</div></blockquote></div></div><br class=""><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><br class=""><blockquote type="cite" class=""><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none" class=""><div class="gmail_quote"><div class="">For what it's worth, in an earlier message I mentioned that I would probably build the server by using mostly code from LLVM, and making sure that it supported the union of things currently supported by LLDB and LLVM's DWARF parsers. 
Doing that would naturally require merging the two (which has been talked about for a long time) as a prerequisite, and I would expect that for testing purposes we might want something like llvm-dwarfdump but that dumps a higher level description of the information (if, for example, we change our DWARF emission code in LLVM to output the exact same type in slightly different ways in the underlying DWARF, we wouldn't want our test to break). So imagine you could run something like `lldb-dwarfdump -lookup-type=foo a.out` and it would dump some description of the type that is resilient to insignificant changes in the underlying DWARF.</div></div></div></blockquote><div class=""><br class=""></div></div></div><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><div class="">At which level do you consider the “DWARF parser” to stop and the debugger policy to start? In my view, the DWARF parser stops at the DwarfDIE boundary. Replacing it wouldn’t get us closer to a higher-level abstraction.</div></div></div></blockquote><div class="">At the level where you have an alternative representation such that you no longer have to access the debug info. In LLDB today, this "representation" is a combination of LLDB's own internal symbol hierarchy (e.g. lldb_private::Type, lldb_private::Function, etc) and the Clang AST. Once you have constructed those 2 things, the DWARF parser is out of the picture.</div><div class=""><br class=""></div><div class="">A lot of the complexity in processing raw DWARF comes from handling different versions of the DWARF spec (e.g. supporting DWARF 4 & DWARF 5), collecting and interpreting the subset of attributes which happen to be present, following references to other parts of the DWARF, and then at the end of all this (or perhaps during all of this), dealing with "partial information" (e.g. 
something that would have saved me a lot of trouble was missing, now I have to do extra work to find it).</div><div class=""><div class=""><br class="inbox-inbox-Apple-interchange-newline">I'm treating DWARF expressions as an exception though, because it would be somewhat tedious and not provide much value to convert those into some text format and then evaluate the text representation of the expression since it's already in a format suitable for processing. So for this case, you could just encode the byte sequence into a hex string and send that.</div><br class="inbox-inbox-Apple-interchange-newline"></div><div class="">I hinted at this already, but part of the problem (at least in my mind) is that our "DWARF parser" is intermingled with the code that *interprets the parsed DWARF*. We parse a little bit, build something, parse a little bit more, add on to the thing we're building, etc. This design is fragile and makes error handling difficult, so part of what I'm proposing is a separation here, where we "parse as much as possible, and return an intermediate representation that is as finished as we are able to make it".</div><div class=""><br class=""></div><div class="">This part is independent of whether DWARF parsing is out of process however. That's still useful even if DWARF parsing is in process, and we've talked about something like that for a long time, whereby we have some kind of API that says "give me the thing, handle all errors internally, and either return me a thing which I can trust or an error". I'm viewing "thing which I can trust" as some representation which is separate from the original DWARF, and which we could test -- for example -- by writing a tool which dumps this representation.</div></div></div></div></blockquote><div><br class=""></div><div>Ok, here we are talking about something different (which you might have been expressing since the beginning and I misinterpreted). 
If you want to decouple dealing with DIEs from creating ASTs as a preliminary, then I think this would be super valuable and it addresses my concerns about duplicating the AST creation logic.</div><div><br class=""></div><div>I’m sure Greg would have comments about the challenges of lazily parsing the DWARF in such a design.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_quote"><div class=""> <br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><div class=""><br class=""></div><blockquote type="cite" class=""><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none" class=""><div class="gmail_quote"><div class="">At that point you're already 90% of the way towards what I'm proposing, and it's useful independently.</div></div></div></blockquote></div></div><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""></div><div class=""><br class=""></div><div class="">I think that “90%” figure is a little off :-) But please don’t take my questions as opposition to the general idea. I find the idea very interesting, and we could maybe use something similar internally so I am interested. That’s why I’m asking questions.</div></div></blockquote><div class=""> </div><div class="">Hmm, well I think the 90% figure is pretty accurate. 
Because if we envision a hypothetical command line tool which ingests DWARF from a binary or set of binaries, and has some command line interface that allows you to query it in the same way our SymbolFile plugins can be queried, and dumps its output in some intermediate format (maybe JSON, maybe something else) and is sufficiently descriptive to make a Clang AST or build LLDB's internal symbol & type hierarchy out of it, then at that point the only thing missing from my original proposal is a socket to send that over the wire and something on the other end to make the Clang AST and LLDB type / symbol hierarchy.</div></div></div>
</div></blockquote></div><br class=""><div class="">A more accurate reflection of my feelings would have been “those 90% seem harder to achieve than you think”. I obviously have no data to back this up, so please prove me wrong!</div><div class=""><br class=""></div><div class="">Fred</div></body></html>