[PATCH] D44560: [DWARF] Rework debug line parsing to use llvm::Error and callbacks

Wed Mar 21 06:36:05 PDT 2018

jhenderson added a comment.

In https://reviews.llvm.org/D44560#1043915, @espindola wrote:

> I don't think we missed it. The callback is used when there is any way that the parsing can continue. The callback should be passed information about the issue. What should not happen is for the code is the library to have the notion of an Error severity.
>
> We have discussed two use cases. In dwarfdump we want to parse as much as we can and print whatever issues are found to stderr.
>
> In lld we would be happy to stop as soon as any issue is found, but as you point out that is not required and we can also just print every issue to stderr.
>
> Neither of these suggests the need for a severity. In fact, given the above it seems the callback can return void, take a StringRef and the parsing code produces an Error only when it cannot recover. What use case do you have in mind that needs more than this?

I think the phrase "parsing can continue" may be a bit overloaded in the current context: the parser only deals with a single table at a time. In some situations, it sees issues that don't preventing parsing of the rest of that table, where in the existing behaviour it just prints to stderr and continues. Other errors cause the parser to return false, and to print a warning.

Some callers, such as llvm-dwarfdump, loop over and parse all the tables, rather than a specific one at a given offset. It is this additional behaviour that needs supporting. Hopefully the following explanation should clear that up a bit more:

In llvm-dwarfdump, the current behaviour, prior to my change, is to stop as soon as any line table is found that cannot be fully parsed, or to continue parsing **that** line table after printing a warning, if a minor issue is found that does not make the line table unreadable. Preventing further parsing is achieved by the parser resetting the offset value and the caller checking if the offset has been reset. I think we all agree that this is not ideal, because there is the potential to be able to parse later tables after the broken one. However, in some situations, this is not possible, because the unit length is broken in some crazy way (in this diff, only possible if illegally using a reserved value, although I plan to add more checks later), so we need to be able to distinguish between this in different ways. The parser could indicate this in one of four ways that I can think of:

1. As with the existing behaviour, reset the length field, and then have the caller check this. The callback would be used in the same manner as the current diff (i.e. for errors that don't prevent continuation of parsing the current table). Other problems would be returned as Errrors (there would be no need for additional information in this case).
2. Add a method to `LineTable::Prologue` that states whether the unit length is valid or not somehow. The caller would then be responsible for checking this, if it might want to continue onto the next table (as in the dwarfdump case), but can also ignore it and stop. The callback would be used in the same manner as above.
3. As in the current version of this diff, provide information in the Error returned to the caller, which the caller uses, if it wishes to. Again, the callback would be used as above.
4. Pass additional information into the callback, making the callback signature `Error(bool IsUnitLengthValid, StringRef Msg)` or similar. The parser should always bail out with the returned Error, even for Error::success(), if an issue is found preventing further parsing of that table, but when it knows that it can carry on, it can check against Error::success().

In the first three cases, whether or not the callback should be changed to return an Error (potentially Error::success()), instead of void, depends on whether there is a user for it. I can certainly imagine there being one, but there currently isn't.

Repository:
  rL LLVM

https://reviews.llvm.org/D44560