[PATCH] D152159: [AppleAccelTable][NFC] Refactor iterator class

Tue Jun 6 15:28:19 PDT 2023

aprantl added inline comments.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp:279
+        FormValue.extractValue(Table.AccelSection, Offset, Table.FormParams);
+    assert(ExtractWorked && "Failed to extract value from hash data entry");
+  }
----------------
fdeazeve wrote:
> dblaikie wrote:
> > aprantl wrote:
> > > fdeazeve wrote:
> > > > dblaikie wrote:
> > > > > fdeazeve wrote:
> > > > > > dblaikie wrote:
> > > > > > > fdeazeve wrote:
> > > > > > > > aprantl wrote:
> > > > > > > > > What does it mean for `ExtractWorked` to be false?
> > > > > > > > > If this can be triggered by malformed input, even an asserts-enabled parser needs to handle it gracefully. Asserts should only be used for internal consistency checks that may never fail.
> > > > > > > > >If this can be triggered by malformed input, even an asserts-enabled parser needs to handle it gracefully. 
> > > > > > > > 
> > > > > > > > That's what I mean in the commit message with:
> > > > > > > > 
> > > > > > > > > We maintain the already existing assumption that failures never happen, but now
> > > > > > > > > make it explicit with an assert.
> > > > > > > > 
> > > > > > > > Today, if we fail to extract, we're going to start producing completely incorrect data and be silent about it. Imagine one of the iteration fails: the offset doesn't get updated and the next iteration is going to re-use the offset of the previous iteration.
> > > > > > > > 
> > > > > > > > IMO we're not handling it gracefully today, that's why I added the assert. But I don't feel strongly about this (just a bit :) )
> > > > > > > > 
> > > > > > > > I think the only way to fix this correctly is to remove the iterator abstraction and use callbacks instead. Non-end iterators should never fail the dereference operator, but data extracts are inherently a "fail-able" abstraction.
> > > > > > > I believe @lhames and I spoke about fallible iterators previously & he's used some iterator abstractions with fallibility (which can be nicer than a callback - for instance, allowing early exit or other novel iteration sequences that a callback API might not provide in a general way) - by having the iterator terminate early (when it hits an error, it becomes == end) and having a member you have to inspect after the fact to query for the error state/if it's failed.
> > > > > > >  and having a member you have to inspect after the fact to query for the error state/if it's failed.
> > > > > > 
> > > > > > That's an interesting idea, but how would that work with range-based for loops, as they don't expose the iterator itself to the programmer?
> > > > > > 
> > > > > > Are you aware of examples of such iterators inside LLVM?
> > > > > > 
> > > > > > > having the iterator terminate early 
> > > > > > 
> > > > > > Note how this also forces the iterator to know how to set itself to the end iterator, which is ok but requires more bookkeeping.
> > > > > > 
> > > > > > 
> > > > > > One other possibility we could try is to have the iterator's `operator *` return an optional. However, the current implementation of these iterators (both before and after this patch) avoid memory operations by keeping an `Entry` inside the iterator; the optional would inevitably have copy that. Unless we use pointers as optionals? (i.e. `operator*` returns an `Entry*`) 
> > > > > > 
> > > > > > While what I'm saying applies only for apple_names, the standardized `debug_names` implementation seems to follow a similar pattern, though the code there is even more complex and I haven't yet been able to digest it.
> > > > > > 
> > > > > > Another thing we could do is to eagerly read all the data (except maybe for the strings) from the section and build a proper data structure with the FormValues and hashes and offsets already parsed. So we never have a chance to error again after the constructor is finished. That said, I suspect the original implementors didn't do this for performance / memory reasons? It's interesting to consider how some queries from LLDB force us to search/materialize all of the table anyway. For example, by search by regex (gotta read all the strings) or by DIE whose offset is inside a given range (gotta read all the HashData to find the DIE offset)
> > > > > Oh, actually - maybe we settled on a different idiom - an out parameter from the range accessor, that's compatible with range-for loops, it's even documented: https://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges
> > > > > 
> > > > > & yeah, certainly one of Lang's goals with libObject was to make it as lazy as possible so as not to do any extra work/hurt performance but also to be as usable as possible on potentially invalid object files - be able to parse all the parts that can be parsed, without tripping over on the first failure and being unable to continue.
> > > > > 
> > > > > The LLVM libDebugInfoDWARF code would probably benefit from similar goals - same idea that it might be used to investigate invalid inputs, and being able to keep going as much as possible/not trip over unrelated errors would be a valuable feature.
> > > > Oh very interesting, I think we can make this work on top of the cleanups of this patch, so I will give it a try as a follow-up patch. Thank you for the references!
> > > > 
> > > > >  one of Lang's goals with libObject was to make it as lazy as possible so as not to do any extra work/hurt performance
> > > > 
> > > > Maybe I'm being naive here, but it's not clear to me whether being lazy (for the bucket/hash/offset/string relocation resolving) is the right call, since we don't cache anything for future use. Granted, if we do `<= 1` queries that visit the entire table, then it is a win. But the moment we do the second query, it would have been faster to rely on a pre-processed data structure, as these data extractors are heavy-weight.
> > > With LLDB as the consumer in mind there are two scenarios we want to optimize for:
> > > 
> > > 1. debugging with a Mach-O debug map: Traversing through thousands of .o files when looking up a DIE by name, most of the time getting a "no" result
> > > 
> > > 2. lookups in a single, very large accelerator table (.dSYMs, ELF binaries, ...)
> > > 
> > Sorry I'm not following this in detail, so feel free to ignore this if it's not worth the hassle to explain.
> > 
> > You mention these are heavyweight structures - that makes me concerned - in theory the accelerator tables were/are meant to be cheap to use from-disk. So I would expect we shouldn't have to/shouldn't be building heavyweight data structures to use these?
> Full disclosure, I have not looked at what an optimized code for the final "equal_ranges" function looks like, but my claim boils down to the following:
> 
> To perform the most basic lookup ("is this string in the table"), we have to call `DataExtractor::getU32` at least 4 times (more if the hash we're looking for is not the first hash in the bucket). That function boils down to a call to `DataExtractor::getU`: https://llvm.org/doxygen/DataExtractor_8cpp_source.html#l00040
> That code is not exactly simple.
> 
> If we were working with a pre-processed data structure for the hashes/buckets, this would be simple 4 integer accesses in memory (after we paid all the cost in advance once). All the design decisions from the accelerator tables indicate that the intent was that queries should always be these simple memory accesses, but there is an inherent cost in having to convert from the "object file" representation (i.e. all the boilerplate in `getU`)
> 
> (Just to make sure we're on the same page, these patches are not changing anything about that in either direction)
I'm not actually sure if I've ever seen the accelerator table parsing show up as being a major time sink in a trace of LLDB before.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152159/new/

https://reviews.llvm.org/D152159