[PATCH] D152947: [DebugInfo] Change DWARFDebugAbbrev initialization

Wed Jun 14 16:44:55 PDT 2023

bulbazord added a comment.

In D152947#4423100 <https://reviews.llvm.org/D152947#4423100>, @dblaikie wrote:

> FWIW - I'd actually be inclined to consider going in the opposite direction, to parsing the abbrevs even more lazily... I'd been looking into the performance of `llvm-symbolizer` without `.debug_aranges` (only CU ranges) and abbrev parsing ends up taking a not insignificant chunk of the time - not a deal breaker, but a bit of work that could be avoided if we lazily read abbrevs (in most cases we'd only need to read the first apprev - likely where the CU abbrev description is)
>
> I'd prototyped a really hacked up patch that might give the general idea: https://pastebin.com/md943qvf

I'm also interested in improving the time it takes to parse the abbrev section, I've seen it take up a non-trivial amount of time when parsing debug info in lldb.

One thing I've been working on for the last month or two is replacing portions of lldb's dwarf parsing logic with llvm's so that we can have one implementation. The process is non-trivial because there are a few differences between their logic. One thing that lldb's parser does really well is error handling, so I've been sinking a lot of that error handling into llvm's implementation. So far I've been able to remove lldb's `DWARFAbbreviationDeclaration` and `DWARFAbbreviationDeclarationSet` classes. I hope to be able to do the same for `DWARFDebugAbbrev`.

Quickly looking over your patch, I don't think that our goals are necessarily opposed. I think it's possible to be lazy and go fast while handling errors. There are 2 things I think need to happen before I can remove lldb's implementation of `DWARFDebugAbbrev`:

1. `DWARFDebugAbbrev` needs to meaningfully surface errors. Right now it does 0 error handling at all, so there's no insight into what failed or why when an error occurs. This was true for the other 2 classes a few months ago, now they are better. I want to eventually have `parse` and `getAbbreviationDeclarationSet` return `llvm::Error` and `llvm::Expected` respectively.
2. `DWARFDebugAbbrev::begin()` needs to stop calling `parse`. There's no way to meaningfully surface an error from `begin`. I want to instead force all users to call `parse` explicitly and handle an error before being able to iterate over all the DeclSets. If you want just one specific DeclSet, you wouldn't have to pay the cost of parsing everything, just the cost of error handling from that one extraction. This patch is the first step towards doing that, specifically I want `Data` to always be not `std::nullopt` while there is still information to parse. I'm open to different approaches here though.

What do you think about this? I'm happy to chat further.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152947/new/

https://reviews.llvm.org/D152947