[llvm-dev] [RFC][DWARF] Handling errors in DWARF .debug_line parsing

Mon Dec 16 11:26:21 PST 2019

(mailing lists can be busy - best to include any known-relevant parties on
the 'to' line to help highlight the discussion for them)

My take on it: I'd be OK with only a strict mode, provided the error
messages are good and the errors are created lazily. That is, you shouldn't
get an error if you never needed to parse or look up a file name, for
instance; the strict mode shouldn't be inefficient by eagerly parsing
portions of the file you don't end up needing. If it turns out there are
consumers that just need to be able to read invalid DWARF, we could deal
with that when it comes up. There are a few ways to do it, I guess.
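
To make "lazily created" a bit more concrete, here is a rough sketch of the
shape I have in mind. Every name in it (Result, RawPrologue,
LazyStrictPrologue) is made up for illustration and is not the existing
DWARFDebugLine API; Result is only a stand-in for something like
llvm::Expected. The header fields are stored unvalidated, and each check
runs only when a caller asks for that field:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Expected<T>; not the real LLVM type.
template <typename T> struct Result {
  std::optional<T> Value; // set on success
  std::string Error;      // set on failure
  explicit operator bool() const { return Value.has_value(); }
};

// Raw header fields, stored without validation at parse time (hypothetical).
struct RawPrologue {
  uint8_t LineRange = 0;
  std::vector<std::string> FileNames; // looked up with 1-based indices here
};

class LazyStrictPrologue {
public:
  explicit LazyStrictPrologue(RawPrologue P) : Raw(std::move(P)) {}

  // Only callers that decode special opcodes need line_range, so the zero
  // check (and its error message) lives here rather than in the parser.
  Result<uint8_t> lineRange() const {
    if (Raw.LineRange == 0)
      return {std::nullopt, "line_range is 0; cannot decode special opcodes"};
    return {Raw.LineRange, ""};
  }

  // The file table is only checked when a name is actually looked up.
  Result<std::string> fileName(std::size_t Index) const {
    if (Index == 0 || Index > Raw.FileNames.size())
      return {std::nullopt,
              "file index " + std::to_string(Index) + " is out of range"};
    assert(Index - 1 < Raw.FileNames.size() && "bounds checked above");
    return {Raw.FileNames[Index - 1], ""};
  }

private:
  RawPrologue Raw;
};
```

With this shape, a table whose line_range is 0 still lets a caller look up
file names without ever seeing an error; only a caller that asks for
lineRange() gets the diagnostic.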

On Mon, Dec 16, 2019 at 3:03 AM James Henderson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi all,
>
> I'm preparing to propose some additional error checks for the DWARF debug
> line parser which we have locally. However, there have been some valid
> comments in the past that some errors really shouldn't prevent parsing for
> certain situations. For example, a line_range of 0 would only be an issue
> if anything needed that information (and would consequently divide by
> zero), whilst the include directories and file names table should be
> terminated with null bytes, but the parser could use the header length
> field to identify when the tables have ended instead.
>
> After a brief discussion with Paul Robinson offline, I think it would be
> good to add different parsing modes to the debug line parser. I've got
> three possible categories:
> 1) "Dump" mode - this would be the most lenient mode. It is intended for
> tools like llvm-dwarfdump that merely want to print the contents to the
> best of their ability, even if the table is malformed/dodgy in some other
> way. It might print warnings, but these should not prevent it continuing to
> parse what it can.
> 2) "Strict" or "Verify" mode - this would be the strictest mode, which
> would emit an error if it sees things like the aforementioned bad
> line_range or filenames tables. Optionally, we could even extend this to
> also cover things like addresses that don't make sense (possibly with or
> without special handling for fixups for GC-sections-processed output), or
> that could be a further mode.
> 3) "Consumer" mode - this would be somewhere in between the two above
> modes and would essentially do whatever a consumer such as LLDB wants. It
> might not need to be as strict as the "Strict" mode, but some guidance here
> from the consumers would be best.
>
> I haven't yet worked out exactly how this would work, but I think it would
> probably fit into the existing error handling mechanism, either with some
> extra conditionals or similar that say whether to report an error or not. I
> will hopefully be putting up some proposed patches later this week that
> illustrate all this.
>
> I'd appreciate any thoughts people have on the different modes, whether we
> would need more/fewer, what they'd each cover etc.
>
> Regards,
>
> James
>
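
For comparison, the three modes in the quoted proposal could be sketched as
a single severity decision made wherever the parser hits a problem. This is
only an illustration: ParseMode, Issue, classify and report are hypothetical
names, and the "Consumer" policy below is my own guess, since the proposal
explicitly leaves that to guidance from consumers such as LLDB.

```cpp
#include <string>
#include <vector>

enum class ParseMode { Dump, Consumer, Strict };         // hypothetical
enum class Issue { ZeroLineRange, UnterminatedTable };   // hypothetical
enum class Severity { Warning, Error };

// Decide how serious an issue is in a given mode.
Severity classify(ParseMode Mode, Issue I) {
  switch (Mode) {
  case ParseMode::Dump:
    return Severity::Warning; // print what we can, never stop
  case ParseMode::Strict:
    return Severity::Error;   // any malformed input is an error
  case ParseMode::Consumer:
    // Assumption: a consumer cannot recover from line_range == 0, but can
    // find the end of the tables from the header length field instead.
    return I == Issue::ZeroLineRange ? Severity::Error : Severity::Warning;
  }
  return Severity::Error; // not reached; keeps compilers quiet
}

struct Diagnostics {
  std::vector<std::string> Warnings;
  std::vector<std::string> Errors;
};

// Record the message and return true if parsing may continue.
bool report(ParseMode Mode, Issue I, const std::string &Msg, Diagnostics &D) {
  if (classify(Mode, I) == Severity::Warning) {
    D.Warnings.push_back(Msg);
    return true;
  }
  D.Errors.push_back(Msg);
  return false;
}
```

A parser written this way would call report() at each check and stop only
when it returns false, which is roughly the "extra conditionals" approach
the proposal mentions.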