<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Thu, Apr 27, 2017 at 1:12 PM Robinson, Paul via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The next feature on my DWARF 5 list is the line-table header. While this<br>
is pretty easy to generate, it is a real bear to parse, so I thought I should<br>
let y'all know what I'm up to and why as I head out to the yak farm. Any<br>
thoughts and suggestions would be very much appreciated.<br></blockquote><div><br>Thanks a bunch for sending this email! - I'd love to see more like this when large pieces are undertaken in LLVM for just these reasons, so we can all get a sense of where things are aiming, the motivation, etc.<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The v5 directory and file tables no longer have a fixed format; instead,<br>
we have a list of field descriptors followed by the fields for each entry<br>
in the directory or file table. Normally the directory table would have<br>
one descriptor:<br>
DW_LNCT_path, DW_FORM_string<br>
This tells us each entry contains a pathname encoded as an inline string.<br>
(Which is essentially how the v4 directory table is encoded.) However,<br>
because of the FORM code, we now have whole new worlds of complication<br>
regarding where the actual string might be. We might have DW_FORM_strp<br>
which puts the actual string in the .debug_str section; eventually we<br>
could have DW_FORM_line_str (pointing to .debug_line_str) </blockquote><div><br>What's DW_FORM_line_str/.debug_line_str for? (So the line table can be kept while stripping the rest of the debug info, including its strings?)<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">or even<br>
DW_FORM_strx (indirecting through .debug_str_offsets).<br>
<br>
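The descriptor-driven layout described above can be sketched roughly as follows. This is a minimal Python illustration, not LLVM code: the helper names (read_uleb128, read_cstring, parse_directory_table) are hypothetical, only DW_FORM_string is handled, and the layout assumed is a one-byte format count, ULEB128 (content type, form) pairs, then a ULEB128 entry count followed by the entries.<br>

```python
# Constant values as assigned in the published DWARF v5 specification.
DW_LNCT_path = 0x1
DW_FORM_string = 0x08


def read_uleb128(data, pos):
    """Decode one ULEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) * (2 ** shift)
        shift += 7
        if (byte & 0x80) == 0:
            return result, pos


def read_cstring(data, pos):
    """Read a NUL-terminated inline string; return (text, next position)."""
    end = data.index(0, pos)
    return data[pos:end].decode("utf-8"), end + 1


def parse_directory_table(data, pos=0):
    """Hypothetical helper: parse a v5-style directory table.

    Assumes every field uses DW_FORM_string, so no other section
    (and no relocation map) is needed to find the string.
    """
    format_count = data[pos]  # single unsigned byte
    pos += 1
    descriptors = []
    for _ in range(format_count):
        content_type, pos = read_uleb128(data, pos)
        form, pos = read_uleb128(data, pos)
        descriptors.append((content_type, form))

    entry_count, pos = read_uleb128(data, pos)
    entries = []
    for _ in range(entry_count):
        entry = {}
        # Each entry carries one field per descriptor, in descriptor order.
        for content_type, form in descriptors:
            if form != DW_FORM_string:
                raise NotImplementedError("form 0x%x not handled" % form)
            entry[content_type], pos = read_cstring(data, pos)
        entries.append(entry)
    return descriptors, entries, pos


# One descriptor (DW_LNCT_path, DW_FORM_string), then two directory entries.
table = bytes([1, DW_LNCT_path, DW_FORM_string, 2]) + b"/usr/src\0include\0"
descriptors, entries, end = parse_directory_table(table)
```

With a single (DW_LNCT_path, DW_FORM_string) descriptor, the per-entry loop reduces to reading one inline string per directory, which is the v4-like case.<br>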
Conveniently, we have the DWARFFormValue class which knows how to decode<br>
data based on what the form code is.<br>
<br>
Inconveniently, DWARFFormValue assumes it is looking at a .debug_info<br>
section, and picks up its relocations from a DWARFUnit. But if we're<br>
using DWARFFormValue to decode data from .debug_line, then it needs a<br>
different relocation map.<br></blockquote><div><br>I'm going to assume there's going to be similar inconvenience on the other side (the emission side).<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It's only the string data that causes a problem; all the other kinds<br>
of data in the file table are constants, and retrieving constants<br>
with DWARFFormValue is no problem.<br>
<br>
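To make the string problem concrete, here is a small Python sketch of how the form code decides where the path text lives. It is an assumption-laden illustration, not the DWARFFormValue API: read_path and the sections dictionary are hypothetical, offsets are taken to be 4-byte little-endian (32-bit DWARF), and the offset bytes are assumed to be already relocated, which is precisely the step that needs a .debug_line relocation map in a real object file.<br>

```python
# Form codes as assigned in the published DWARF v5 specification.
DW_FORM_string = 0x08
DW_FORM_strp = 0x0E
DW_FORM_line_strp = 0x1F  # published name of the form pointing to .debug_line_str

# Which string section an offset-based form refers to.
SECTION_FOR_FORM = {
    DW_FORM_strp: ".debug_str",
    DW_FORM_line_strp: ".debug_line_str",
}


def read_cstring(data, pos):
    """Read a NUL-terminated string; return (text, next position)."""
    end = data.index(0, pos)
    return data[pos:end].decode("utf-8"), end + 1


def read_path(form, line_data, pos, sections, offset_size=4):
    """Hypothetical helper: decode one path field from .debug_line.

    Returns (text, next position in line_data). Assumes any offset
    stored in line_data has already had its relocation applied.
    """
    if form == DW_FORM_string:
        # The text is inline in .debug_line itself.
        return read_cstring(line_data, pos)
    if form in SECTION_FOR_FORM:
        # The field is an offset into a separate string section.
        raw = line_data[pos:pos + offset_size]
        offset = int.from_bytes(raw, "little")
        text, _ = read_cstring(sections[SECTION_FOR_FORM[form]], offset)
        return text, pos + offset_size
    raise NotImplementedError("form 0x%x not handled" % form)


sections = {".debug_str": b"\0main.c\0", ".debug_line_str": b"/src\0lib\0"}
inline = read_path(DW_FORM_string, b"a.c\0rest", 0, sections)
via_strp = read_path(DW_FORM_strp, (1).to_bytes(4, "little"), 0, sections)
via_line = read_path(DW_FORM_line_strp, (5).to_bytes(4, "little"), 0, sections)
```

Only the offset-based branch depends on anything outside .debug_line, which matches the point that constants and inline strings decode without trouble.<br>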
<br>
I think the right tactic is a "top-down" approach, starting by teaching<br>
DWARFDebugLine to parse a v5 line-table header but support only<br>
DW_FORM_string for the paths. This should let me use an unmodified<br>
DWARFFormValue to parse the directory and file tables.<br></blockquote><div><br>Any idea what form you'll be using for LLVM's emission? LLVM currently only emits strp - figure the same for the line table? Or more likely to use _string unconditionally?<br><br>In any case - if/when you have the right format support in llvm-dwarfdump, you could go ahead and implement the output code in LLVM's codegen, even before llvm-dwarfdump can handle every arcane format that any DWARF producer might decide to use. (And then you can continue implementing those - but it'd get you the LLVM functionality sooner, rather than gating it on having a fully general parser.)<br><br>This approach has certainly been taken in the past - implementing enough dumping support as needed for LLVM's generation functionality and expanding as needed.<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">From there, teaching DWARFFormValue to handle DW_FORM_strp from the<br>
.debug_line section should be pretty well motivated and it should be<br>
straightforward to see what's really needed in terms of the API.<br>
<br>
Once we get that far, I would hope that the line_str and strx&lt;N&gt; forms<br>
would not require much additional effort. Actually Wolfgang is<br>
separately working on the strx&lt;N&gt; forms so with any luck that would<br>
Just Work for the .debug_line section.<br>
<br>
Oh yeah, after all that I'd actually generate the v5 header from LLVM.<br>
The idea is that by then, I can use llvm-dwarfdump to validate it and<br>
be very confident that it would all work.<br>
<br>
Does all that sound like a plan? The alternative would be to try to<br>
teach DWARFFormValue to handle DW_FORM_strp from .debug_line up front,<br>
but I think we might rather go at this in smaller pieces.<br>
<br>
Thanks,<br>
--paulr<br>
<br>
</blockquote></div></div>