[PATCH] Move DWARFUnitSection from the vector API to something more semantically accurate.

Wed Sep 17 16:30:07 PDT 2014

On Wed, Sep 17, 2014 at 3:38 PM, Alexey Samsonov <vonosmas at gmail.com> wrote:

>
> On Wed, Sep 17, 2014 at 1:56 AM, Frédéric Riss <friss at apple.com> wrote:
>
>>
>> On 17 Sep 2014, at 08:47, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Tue, Sep 16, 2014 at 11:35 PM, Frédéric Riss <friss at apple.com> wrote:
>>
>>>
>>> On 17 Sep 2014, at 08:11, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>
>>>
>>> On Tue, Sep 16, 2014 at 10:47 PM, Frederic Riss <friss at apple.com> wrote:
>>>
>>>> The Units method was suggested by Alexey I have no strong feeling
>>>> either way. With this patch applied, the inheritance of SmallVector isn't
>>>> necessary anymore, I didn't fold the move from inheritance to composition
>>>> in this patch, but it could be done.
>>>>
>>>
>>> Yeah, I saw from his CR feedback.
>>>
>>> To be honest, since seeing this patch I've been dabbling with some
>>> changes of my own to demonstrate a possible direction.
>>>
>>>
>>>> The next step for me is to factor the the parsing code into this class.
>>>>
>>>
>>>
>>> The thing is, I think factoring more of the logic, such as the parsing,
>>> into this class will actually eventually remove the need to expose the
>>> container-like API of this class at all.
>>>
>>> My patch is probably a bit over the top, but for demonstration, I've
>>> included it (locally I've got it broken down into smaller steps).
>>>
>>> The only thing missing to make DWARFUnitSection not expose begin/end
>>> (and thus, not expose any container-like API) is to
>>>
>>>
>>>> There is an intermediate step though: the actual code is semantically
>>>> wrong when it comes to type units. Type units might come from different
>>>> sections, but we put them all in the same DWARFUnitSection object.
>>>
>>>
>>> Why is this a problem? Oh, I suppose getUnitForOffset doesn't work so
>>> well when it's not a contiguous non-overlapping range?
>>>
>>> Fair enough.
>>>
>>> Note that it's only non-fission type units that come from different
>>> sections. (& I don't know how comdat sections look once they're all linked
>>> and deduplicated - so I guess it might be only type units in object files
>>> (not executables or .dwo files) that are in their own sections)
>>>
>>>
>>> Yes, this will be the case in object files. And the issue really is
>>> getUnitForOffset getting confused (after all the thinking we put into this
>>> one, that would be a pity, wouldn’t it? :-))
>>>
>>>
>>>
>>>> Once this is fixed, we can move parsing into DWARFUnitSection and
>>>> remove at least the addUnit method.
>>>>
>>>
>>> Fair enough - the attached patch just moves it as-is, which is no worse
>>> than the current state, I don't think.
>>> 
>>>  DWARFUnitSection.diff
>>> <https://docs.google.com/file/d/0B0jpkch3iC_7ZXRwR1d5cGFOcVk/edit?usp=drive_web>
>>>
>>>
>>> I don’t have permission to access that file.
>>>
>>
>> Sorry about that - seems gmail's a bit hit-or-miss with the "attach from
>> drive" permissions updates. It should be viewable (& commentable, for
>> whatever that's worth - not sure that Google Docs comments make for a good
>> review tool) by anyone with the link now.
>>
>>
>>> I guess it’s pretty similar to what my topic branch looks like.
>>>
>>
>> Quite possibly, not sure. There's a few fun tools in there (had a bit of
>> fun adding some generic iterator utilities that I should probably find a
>> use for in tree and checkin anyway)
>>
>>
>> The end result in the DWARF classes is very similar, but yeah, I didn’t
>> have the iterator niceties. Few things I would have made differently:
>>  - I would have returned iterators rather than implement forEachLineTable
>>  - The collectAddressRanges looks strange (at least I wouldn’t have put
>> the first argument in there). I don’t think it brings us anything, as it is
>> used only once anyway.
>>  - I wouldn’t use the empty() property of the vector to check if the
>> section has already been parsed. Having managed built Units and having
>> tried to parse the section look different enough to me that I’d put a
>> separate boolean for the parsing.
>>
>> I guess it’s Alexey’s call how he wants to move forward on this.
>>
>
> Overall, I'm really satisfied with David's direction. Callables in
> forEachLineTable and collectAddressRanges() do indeed seem like an
> overkill, as these routines (esp. the last one, as Frederic points out)
> doesn't seem to be
> generally useful.
>

So we're all in agreement that we should expose (either via a range
accessor (Fred's "Units()") or begin()/end()) iteration access to the units
in a DWARFUnitSection, I think. (correct me if I'm wrong in that
understanding/agreement)

> I'm also mildly concerned about getAddressByteSize() piece - is it
> theoretically possible to have multiple compile units with different
> address byte sizes in real life? It looks strange to me, but as it's
> possible in DWARF, I think we should try to not hard-wire same address byte
> sizes if possible.
>

I agree - this mildly concerns me. For now I was just persisting the
existing implementation.

In reality I think this would mean we'd have to be more cautious about
dumping things for which we can't find the CU - eg: currently
debug_line.dwo and debug_ranges both rely on using /some/ byte size from
the last CU (assuming you're dumping CUs... otherwise they don't have an
address byte size at all? I don't know what they do then). In actuality
we'd have to be more careful and only dump ranges we can find via CUs (then
we wouldn't be able to dump an object file that only contains ranges -
because without the address size we can't read it correctly).

Currently the only use of DWARFUnitSection::getAddressByteSize (or
currently written as "getCompileUnitAtIndex(0)->getAddressByteSize()")
which could at least be adjusted to do the same thing we do for
debug_line.dwo/debug_ranges - but all 3 of them should be fixed to find the
line/ranges/locs via the CU and use the CU's address byte size.

Not sure what we should do with any chunks of these sections which we don't
cover via CU references...

- David

> Other than that, I'd like to see most of this patch landed :)
>

Fred - you're welcome to take any parts of my patch as inspiration, but it
sounds like we're in similar territory (& as I said, some of the fancier
bits of my patch aren't applicable once you account for the wrongness of
type unit handling which you pointed out/intend to fix)

>
>
>>
>> Fred
>>
>>
>>> 
>>> There's some extra tidying up I could do on that patch (& locally I've
>>> got it more granular) - but I figured it might be some use to you and/or
>>> Alexey to see where it might go. Like your patch, mine doesn't actually
>>> remove the inheritance - it just demonstrates (by making the inheritance
>>> private and not exposing any container-like operations or typedefs) that
>>> it's no longer relied upon by anything other than the implementation. At
>>> which point it's fairly easy to refactor it to composition.
>>>
>>>
>>> To test my patch, this is exactly the approach I used. I moved to
>>> protected inheritance and chose the APIs that would minimise the API
>>> surface while not requiring too much of a refactoring. The goal was simply
>>> to produce an easy to review ‘wrapper’ patch, and leave the bigger methods
>>> for later. I thought this is the approach you were suggesting. And yes,
>>> this introduces a bit of API that is meant to disappear soon, but it would
>>> have allowed me to propose the inheritance removal patch in parallel to
>>> more functionality refactoring ones.
>>>
>>
>> Fair enough - sorry, that's certainly clear now but was less so when this
>> patch (& the obseleted precursor) was first provided - I wasn't sure if
>> this was the only effort to remove the inheritance.
>>
>> For myself I'd (& I did, in the branch where I derived the attached patch
>> from) approach it by just doing the incremental API migrations up-front. So
>> long as it's clear that you're going to get to removing the inheritance
>> soon-ish and it's not going to languish (sorry, this is perhaps a case
>> where you're paying for the sins of previous developers (including
>> myself/ourselves, to be fair) - we tend to be a bit twitchy about "short
>> term workarounds" and vitally concerned about making sure they don't stick
>> around longer than they're welcome) in tree, there's no particular need to
>> front-load it than to just migrate an API at a time until the container
>> surface area goes to zero, then quietly remove the inheritance at the end.
>>
>> It's all good though.
>>
>> - David
>>
>>
>>>
>>> Fred
>>>
>>> (personally, I think moving the parsing and probably the dumping support
>>> there makes sense - the rest of it... I'd be just as happy to have
>>> begin/end exposed on the DWARFUnitSection and keep the last couple (the
>>> line table and arange handling) where they were, probably)
>>>
>>> Alexey - you mentioned preferring not to make DWARFUnitSection not a
>>> container itself (but rather to have a "units" member on it) - I don't feel
>>> strongly, but we do have at least some prior art/tradition of having things
>>> /be/ containers in LLVM (Modules are containers of functions, functions are
>>> containers of basic blocks, basic blocks are containers of instructions,
>>> etc) so it's not a problem to do that & in some cases I think it's rather
>>> nice. Interested to hear your thoughts/perspective.
>>>
>>>
>>>> Once this is done a bunch of accessors from the context can go away
>>>> also.
>>>>
>>>> http://reviews.llvm.org/D5369
>>>
>>>
>>
>
>
> --
> Alexey Samsonov
> vonosmas at gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140917/a8be637b/attachment.html>