[llvm-dev] DWARF debug line error handling changes

Mon Jan 27 15:22:00 PST 2020

Thanks for all the work & context, James!

On Mon, Jan 27, 2020 at 5:51 AM James Henderson <
jh7370.2008 at my.bristol.ac.uk> wrote:

> Hi all,
>
> Since December, I've made several changes to the DWARF debug line parser
> to improve its error handling. Some examples include removing redundant
> error checks[1] and making address size mismatches non-fatal to parsing[2],
> with several more about to land or being planned.
>
> David Blaikie recommended I post something to give a bit more context to
> these changes. I am a member of Sony's Linker and Binutils team, and as
> part of the linking process, we make updates to the debug line section in
> order to handle GC-section processing and similar modifications. In order
> for these operations to be safe, we need the input line program to be in a
> good state. As a result, in addition to the existing LLVM error checks, we
> have added several additional downstream checks. These were added
> downstream at the time for speed, but we've always had the intention of
> bringing these to the wider community. We now have the time to do this.
>
> There are broadly-speaking two kinds of changes we are making:
>
> 1) Adding additional new checks to make sure the table is valid. Where
> possible, these are made as late as reasonable. That way, we don't emit
> errors or warnings until it is necessary, and potentially therefore not at
> all in some cases.
>
> 2) Making existing error checks less fatal, i.e. changing the code to
> allow the parser to continue even though something doesn't look right in
> the code. This allows consumers to gather more information, whether for
> displaying or otherwise. In several cases, the change is to assume that the
> length recorded in a field is correct, even if it doesn't match what either
> the standard says should be the length.
>

If the standard conflicts with the length when we read the length (without
reading further) - yeah, I think we should fail/warn/something there. (as I
think we do - if you have a length that's not long enough for the header
we're expecting to read, for instance, etc) & if that means this
contribution can't be parsed (eg: header's truncated) we treat it that way
(this contribution is broken) & can still go on to the next contribution
and read that.

> This won't always be the right thing to do (it might be that just the
> length field is correct), but providing more information, together with a
> warning that should indicate that something may not be quite right, is
> surely more useful than refusing to give the extra information. I picked
> the recorded length because other areas of the code already assume it to be
> correct, and use it to iterate onto other parts of the line table section.
>

Hmm, I thought I saw somewhere (in the many active reviews at the moment,
some from Sony, some from some other folks - haven't got the full picture
myself) I thought someone, maybe Paul I thought, said something about
stopping/not continuing when we found an error in a previous contribution -
which doesn't sound right (& is different from what you've just said here).
Ah well, maybe I'm hallucinating.

It might be interesting to have some harder errors if contributions ever
overlap (if two CUs point to two different parts of debug_line, for
instance, and one of those pointers+parsed length overlaps with the other
pointer). That's a bad place to be and may mean we should give up on
parsing any of debug_line, or at least any unit that has such an overlap -
but harder to detect.

The alternative, I suppose is to go the other way (& that's something that
might be good/important to do at some point) & not parse any contributions
that aren't pointed to by a CU (well, debug_line in DWARFv5 is designed to
be parsed standalone too - and maybe everything in DWARFv5 is designed to
be parsed standalone, but we don't know whether a section contains only v5
contributions (debug_line maybe should've changed name so we'd know it only
contained v5 contributions - like loclists and rnglists - ah well)... )

Just rambling at this point.

- Dave

>
> I hope this provides greater context for these changes!
>
> Regards,
>
> James
>
> [1]
> https://github.com/llvm/llvm-project/commit/418cd8216b41f4c08e0e1b22feda381d9b2345da
> [2]
> https://github.com/llvm/llvm-project/commit/07804f75a6cc506fada40c474f1e60840ce737d8
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200127/449acbf4/attachment.html>