[PATCH] Avoid empty .debug_loc entries and lists

David Blaikie dblaikie at gmail.com
Fri Jun 5 10:17:19 PDT 2015


On Fri, Jun 5, 2015 at 10:05 AM, Adrian Prantl <aprantl at apple.com> wrote:

>
> On Jun 5, 2015, at 9:45 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Fri, Jun 5, 2015 at 9:34 AM, Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> On Jun 4, 2015, at 5:33 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Thu, Jun 4, 2015 at 5:23 PM, Duncan P. N. Exon Smith <
>> dexonsmith at apple.com> wrote:
>>
>>>
>>> > On 2015 Jun 4, at 16:50, David Blaikie <dblaikie at gmail.com> wrote:
>>> >
>>> >
>>> >
>>> > On Thu, Jun 4, 2015 at 4:46 PM, Duncan P. N. Exon Smith <
>>> dexonsmith at apple.com> wrote:
>>> >
>>> > > On 2015 Jun 4, at 16:36, David Blaikie <dblaikie at gmail.com> wrote:
>>> > >
>>> > >
>>> > >
>>> > > On Thu, Jun 4, 2015 at 4:28 PM, Duncan P. N. Exon Smith <
>>> dexonsmith at apple.com> wrote:
>>> > >
>>> > > > On 2015 Jun 4, at 16:09, David Blaikie <dblaikie at gmail.com> wrote:
>>> > > >
>>> > > > Judging by the code, does this cause us not to create a variable
>>> if it has no locations? That's probably not quite right - even if we have
>>> no locations, it's probably a good idea/correct to describe the variable so
>>> shadowing, etc, works correctly (if someone tries to print the variable the
>>> compiler still finds the right variable but just reports that it doesn't
>>> know the location)
>>> > >
>>> > > So the code here already has an early return if there are no ranges:
>>> > >
>>> > >     for (const auto &I : DbgValues) {
>>> > >       InlinedVariable IV = I.first;
>>> > >       if (Processed.count(IV))
>>> > >         continue;
>>> > >
>>> > >       // Instruction ranges, specifying where IV is accessible.
>>> > >       const auto &Ranges = I.second;
>>> > >       if (Ranges.empty())
>>> > >         continue;
>>> > >
>>> > > My patch effectively hits this `continue` more often.  The difference
>>> > > is cases where we *thought* we knew how to emit ranges, but in fact
>>> > > didn't know how.
>>> > >
>>> > > I suppose I assumed the intended behaviour was "don't emit the
>>> > > variable if we don't have any ranges for it" since that's what was
>>> > > going on, but are you saying this `continue` is a bug?  (Or am I
>>> > > totally missing your point?)
>>> > >
>>> > > How do we end up with an empty range list here? That seems strange -
>>> I assume we're just lazily creating the rang lists in
>>> calculateDbgValueHistory... (I could sort of understand empty entries in a
>>> range list - when it turns out the dbg_value intrinsic describes no
>>> instructions due to things being optimized away, hoisted here or there,
>>> etc).
>>> >
>>> > No idea, I just assume it happens because there's code.
>>> >
>>> > Not sure what you mean.
>>> >
>>> > What I'm trying to understand is whether or not we should be fixing
>>> this bug closer to the source (in calculateDbgValueHistory) that's
>>> producing empty ranges or empty entries.
>>>
>>> One thing that Adrian walked me through is that even if we have
>>> ranges, and we have entries after the call to `buildLocationList()`,
>>> we can still fail to emit any bytes during `DebugLocEntry::finalize()`
>>> since we don't know how to emit every DWARF expression.
>>>
>>
>> How are we building DWARF expressions we can't emit? Shouldn't we just
>> not build those? (I guess this is more of a question for Adrian)
>>
>>
>> There are two cases that I’m aware of: Constant float expressions and
>> MachineRegisters that cannot be synthesized in DWARF. Constant floats
>> cannot be expressed in DWARF. DwarfExpression is trying very hard to
>> synthesize registers that that don’t have a DWARF register number out of
>> combinations/pieces of sub- and super-registers but may fail. Basically we
>> have to try to build the expression before we know whether it can be built
>> or not.
>>
>
> Sure - but at the end, if we decide we couldn't build the expression,
> wouldn't we just throw out the whole entry? (& if we end up with no
> entries, we'd not add the range list, etc)
>
>
>>
>>
>>
>>> Because of that, I changed my approach from "trying to get `Ranges` to
>>> match up with `Entries`" to "popping empty/useless entries and ranges
>>> off the stack", and figured Ranges/Entries inconsistencies were a
>>> compile-time optimization that weren't as urgent.
>>>
>>
>> My perspective is just to understand why they're there in the first place
>> - I dislike solutions/fixes/patches that are predicated on "I don't know
>> why this state occurs, but I can handle it here" - they feel unsatisfying &
>> can end up layering workarounds on workarounds when there was some
>> underlying issues that could've been addressed & lead to simpler code with
>> stronger invariants.
>>
>> But if there's a reason for them (Empty range lists or empty ranges in
>> the range lists) that's cool - I just don't know it & wanted to understand
>> it before I'd be comfortable with the patch.
>>
>>
>> I’m pretty confident that empty ranges only happen because we use a (too)
>> simple heuristic to decide wether to emit a location in-line or as a
>> .debug_loc entry. (That heuristic being whether there is only a single
>> DBG_VALUE per bit_piece of the variable).
>>
>
> I'm a bit confused - I figured we'd build up the range list - then, if the
> length of that list was one we'd use an in-line location otherwise we'd use
> a debug_loc entry. (& if the length of the list was zero, we wouldn't add
> it in the first place)
>
>
> That would off course be much saner. In fact Duncan’s patch gets us there.
>

Perhaps I've lost all basis for this discussion, sorry - it looked like
Duncan's patch it filters the results after DbgValues and the ranges it
contains have been built.

What I'm trying to understand is what the motivation/justification is for
not enforcing a stronger invariant on the code that builds this data
structure (calculateDbgValueHistory) so it never produces empty ranges in
this list in the first place.

(sorry, I realize the ambiguity now - I said "avoid empty range entries"
and you reasonably read that as referring to the output -  when what I was
trying to refer to was the internal data structure "DbgValues" that
contains the range data we're working from)


>
> It doesn’t handle the single entry special case, but it also shouldn’t: I
> just realized that it is not necessarily correct to emit a single-entry
> debug_loc entry inline, because in a case like:
>
>   l1: DBG_VALUE, var, reg0
>   ...
>   l2: DBG_VALUE, var, noreg
>   ...
>
> we’d still need to emit a single-entry debug_loc entry to describe that
> the reg0 location is only valid in the range [l1..l2).
>

Agreed.


>
>
>
>> If building one or more expressions fails then we may end up with one
>> (should have been inline instead) or zero (shouldn’t have a DW_AT_location
>> at all) .debug_loc entries.
>>
>
> Can we practically/easily remove those at the source?
>
>
> Yes: DwarfDebug::collectVariableInfo() should invoke
> RegVar->setDebugLocListIndex() _after_ buildLocationList() and only if the
> location list is nonempty.
>
> -- adrian
>
>
>
>>
>> -- adrian
>>
>>
>>
>>> Maybe we should also be adding the commented out code I had in:
>>>
>>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278797.html
>>>
>>> It also seems like the `continue` in the current code might be the wrong
>>> thing (and we should create the variable with no DW_AT_location?).
>>>
>>
>> As you mentioned, the loop that catches optimized-out variables is
>> probably fine for all these cases (because there are cases where all the
>> dbg_values could be optimized away & we'd find no evidence of the variable
>> at all, I should imagine - so that logic has to be up to the task of coping
>> with these cases).
>>
>>
>>> (All of this stuff is a tangent for me.  I wandered in here to remove
>>> the empty .debug_loc entries/lists in the output.)
>>>
>>
>> Right - just trying to make sure I understand why this is the right place
>> to fix that issue & not working around some other issue/missing invariant.
>>
>>
>>> > What's the right thing to do?
>>> >
>>> > I don't have all the context here to know that off-hand. Trying to
>>> discuss it to try to wrap my head around enough of it to offer a useful
>>> opinion - sorry it takes a while.
>>>
>>> Not trying to rush you, sorry if it seems that way.  This isn't really
>>> blocking me (I have other things to work on (although obviously I'd
>>> like to resolve this some time)).
>>>
>>> > Should we create a variable the same
>>> > was as the for loop below for function arguments (that doesn't
>>> > attach ranges)?  Should we not create a variable?
>>> >
>>> > If we get the same output between creating it here or in the later
>>> loop, it doesn't matter in the broad sense.
>>> >
>>> > Is it actually
>>> > *useful* that we create a .debug_loc list with no entries and point
>>> > to it?
>>> >
>>> > No, there's no benefit to pointing to an empty loc list compared to
>>> just having no DW_AT_location at all.
>>>
>>> Thanks for confirming.
>>>
>>> I'd be happy to adjust my patch to just address the empty .debug_loc
>>> list/entries and continue to create the same variables that were being
>>> created before (just missing whatever magic makes them create a
>>> `DW_AT_location` -- I'm guessing I just use the constructor logic
>>> that's used for optimized-out function parameters in the loop that
>>> follows?).
>>>
>>
>> No, I think that's fine as is - as you pointed out, we have that other
>> loop for emitting optimized-out variables and we have other logic that's
>> skipping variables here already.
>>
>>
>>> (It sounds like it was a poor assumption on my part that we were trying
>>> to avoid emitting variables if they have no ranges.)
>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150605/a2d86098/attachment.html>


More information about the llvm-commits mailing list