[Lldb-commits] [PATCH] D32167: Add support for type units (.debug_types) to LLDB in a way that is compatible with DWARF 5

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Fri May 11 03:21:12 PDT 2018


labath added a comment.

The thing here is that I don't see how can be done in a non-hacky way in DWARFDataExtractor while DWARFDataExtractor remains a (subclass of) DataExtractor. And it is a hack because as soon as you slide the extractor, a number of its functions become booby-trapped (explode as soon as you call them) or just return nonsensical results. And I'd like us to implement it in a way that it is not a hack, on any level.

I see several ways of achieving this (in decreasing order of intrusiveness):

1. do this in the DataExtractor class. This would require us to change (extend) the definition of the class. Right now, I would define this class as "a view of a range of bytes with a bunch of utility functions for accessing them". After this change it would become something like "a view of a range of bytes with a bunch of utility functions for accessing them, **where the accessor functions automatically apply a bias to all offsets**". That's precisely the kind of change you are making now, only the difference would be that we would state this up-front rather than doing it behind the DataExtractor's back and hoping it works. In practice, this would be as simple as adding another `offset_t m_bias;` field to the class and going through it's methods to make sure the do something reasonable with it. "Something reasonable" would generally mean "subtract it from the input offset", but there are a couple of functions that would need to do something different, (e.g.,  ValidOffset() would return true for values from `m_bias` to `m_bias + (m_end-m_start) -1` because those are the only valid offsets, etc.). This is my preferred solution because it provides with a consistent DataExtractor interface (and in terms of code churn, I don't think it's harder to implement than any of the other solutions). It's true that dwarf type units will probably be the only user of these "slid" extractors, but I don't think that matters as their functionality can be defined in a clear and self-consistent manner without any references to dwarf.

2. Add the `m_bias` field to the DWARFDataExtractor class, but stop the class from publicly inheriting from DataExtractor (it can inherit privately, or have it as a member). The result of this is the same as the first option - we get to go through the DataExtractor interface and make sure all its functions do something sensible. The difference is that we may not need to go through that all methods, only those that are used for dwarf parsing; and the sliding trick can then be kept dwarf-specific.

3. Keep DWARFDataExtractor as-is, and add a new SlidingDWARFExtractor class instead (which will then to the `m_bias` trick and so on); then convert all places that need this to use the new class instead. This is a variation on the second option. I don't see it as particularly interesting, as it the DWARFDataExtractor is already a very small class, but it would still provide us self-consistent APIs everywhere.

I can also see some kind of a solution involving a ConcatenatingDataExtractor, which would contain two or more DataExtractors as it's backing store and then pretend they represent one continuous address space from `0` to `sum(sizes)` (no sliding necessary). This would achieve what you wanted at a higher level. In a way it would be nicer because there would be a single canonical bytestream for each symbol file and you wouldn't have to ask each DIE for it's own extractor, but I expect it to be a bit harder to implement, and may have a runtime cost.

The next solution which does not require sliding is to modify your previous patch `D46606` and insert some kind of offset fixup code there. It seems that in every place you call the new `GetData()` function, you have an offset around (makes sense, as the reason you are calling GetData is because you want to parse something). We could modify that function to take a virtual (slid) offset as a parameter, and return an offset which is suitable for usage in the returned extractor. So, the usage would become something like `std::tie(extractor, real_offset) = die.GetData(virtual_offset);`. I'm not sure how nice this will end up in practice, but it's a possibility.

Let me know what you think of these. I'm open to other options as well. The thing I'd like avoid is an implementation that relies on putting objects in an invalid state or invoking undefined behavior.


https://reviews.llvm.org/D32167





More information about the lldb-commits mailing list