[PATCH] LowerDbgDeclare - get rid of redundant dbg.values when an alloca survives optimization

Mon Apr 14 16:01:12 PDT 2014

>> Actually this is looking more like I was wondering (your testcase is
>> better here too).
>>
>> Side note: Ugh, the offset 0 thing really needs to get fixed.
>
> I didn’t want to pollute the patch with unrelated fixes, so I opted to just document the status quo here. Good thing is that there’s not many of these fixmes left any more!
>

:)

>>
>> I think this patch is probably going in the right direction. It does,
>> however, seem to start to conflate the ideas of dbg.value and
>> dbg.declare. Could you start, perhaps, with what you see the
>> difference is between these two intrinsics after this set of patches?
>
> My understanding of the two is this:
>
> dbg.declare
> -----------
>
> A dbg.declare describes (a variable that is stored in) an llvm::AllocaInst. It is valid for the entire lifetime of the alloca it describes. Dbg.declares are stored in the MMI table and end up in DwarfDebug as FI locations that are valid for the entire (lexical) scope the variable is defined in. This is the behavior we see and expect at -O0.
>
> dbg.value
> ---------
>
> A dbg.value describes an llvm::Value. It is valid only for the lifetime of the Value. LowerDbgDeclare (-O1 and up) lowers each dbg.declare into multiple dbg.values — one for each use of the alloca described by the dbg.declare. On trunk, this happens only for uses that are stores and loads (AFAIK the only other type of use is as a byref function argument). SDag/FastIsel lower dbg.values into DBG_VALUEs on the MI layer. There is a known bug with calculating the ranges for a sequence of DBG_VALUEs, (this is what Alexey Samsonov’s recent patch was all about).
>

I agree that this is the state of the world right now.

> What does my patch do?
> It changes LowerDbgDeclare's behavior to inserts a dbg.value for *all* uses of the alloca, including byref args, such that after LowerDbgDeclare is finished, all dbg.declares are replaced by one or more dbg.values, and no dbg.declare is left. In the specific case of a byref argument, it will lower a dbg.declare (for an alloca) into a dbg.value for an alloca, describing the (indirect, offset 0) value that is stored in the alloca, but this does not contradict the above definition.
>

This seems like a good step.

>> I think we're going to stop needing dbg.declare here in general, but
>> I'd like to see a plan to get there from here.
>
> We’re moving in that direction. My goal for this time was to get rid of dbg.declares after LowerDbgDeclare, but keep the flow for -O0 unchanged.
>

OK. Do you have plans for O0 changes here?

For the final review would you mind uploading to phabricator? Might
make it a bit easier on me/Dave/anyone else. :)

-eric

>>
>> Thanks for all of the work here, it really does look like the right
>> general direction, but I think we need to make sure we're moving in a
>> deliberate direction.
>
> I’m very much in favor of getting this right rather than adding a hack that works for a specific case.
>
>>
>> -eric
>
>
> thanks!
> Adrian
>
>>
>> On Sun, Mar 30, 2014 at 11:50 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>> Hi Eric and David,
>>>
>>> Here is the promised second part, which is actually independent from part 1. This improves the quality of debug info for optimized code a lot, particularly in the case where variables are passed by reference. Also, it solves the mystery of the lost DBG_VALUEs discovered earlier in this thread.
>>>
>>>    Debug info for optimized code: Support variables that are on the stack and
>>>    described by DBG_VALUEs during their lifetime.
>>>
>>>    Previously, when a variable was at a FrameIndex for any part of its
>>>    lifetime, this would shadow all other DBG_VALUEs and only a single
>>>    fbreg location would be emitted, which in fact is only valid for a small
>>>    range and not the entire lexical scope of the variable. The included
>>>    dbg-value-const-byref testcase demonstrates this.
>>>
>>>    This patch fixes this by
>>>    Local.cpp
>>>    - emitting dbg.value intrinsics for allocas that are passed by reference
>>>    - dropping all dbg.declares (they are now fully lowered to dbg.values)
>>>    SelectionDAG
>>>    - renamed constructors for SDDbgValue for better readability.
>>>    - fix UserValue::match() to handle indirect values correctly
>>>    - not inserting an MMI table entries for dbg.values that describe allocas.
>>>    - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
>>>    CodeGenPrepare
>>>    - leaving dbg.values for an alloca were they are (see comment)
>>>    DwarfDebug
>>>    - drive-by fix for Merge() not handling constant values correctly, testcase
>>>      included.
>>>    Other
>>>    - regenerated/updated instcombine-intrinsics testcase and included source
>>>
>>>
>>>
>>>
>>>
>>>
>>> cheers,
>>> adrian
>>>
>>> On Mar 27, 2014, at 1:54 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>>
>>>> Hi Eric, David,
>>>>
>>>> time to revive age-old threads! I looked at this again, and here is the first in a series of patches that aim at improving debugging for optimized code.
>>>>
>>>> This one eliminates redundant variables being emitted when LowerDbgDeclare is hoisting a dbg.declare into a different lexical block as it happens in the included testcase.
>>>>
>>>> Next up: dealing with variables that are located at a FrameIndex and described DBG_VALUEs in different parts of the same function.
>>>>
>>>> -- adrian
>>>>
>>>> <0001-Debug-info-Allow-collectVariableInfoFromMMITable-to-.patch>
>>>>
>>>>
>>>> On Oct 29, 2013, at 1:23 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 29, 2013 at 1:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 29, 2013 at 1:06 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>>>>
>>>>> On Oct 29, 2013, at 11:51, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 29, 2013 at 11:41 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>>>>> Hi Eric and David,
>>>>>>
>>>>>> On Oct 21, 2013, at 17:07, Eric Christopher <echristo at gmail.com> wrote:
>>>>>> Cool deal. I've got some ideas on how to refactor/rewrite some of our
>>>>>>> loc handling anyhow, but it's going to be a few days before it's
>>>>>>> ready.
>>>>>>
>>>>>>
>>>>>> I had some time to look into this some more and I learned a couple of interesting things:
>>>>>> - The ##DEBUG_VALUE: comments in the asm output are really just comments do not necessarily show up in the DWARF. This is maybe not surprising, but what was surprising to me are the circumstances under which a DBG_VALUE makes it into the location list of a local variable:
>>>>>> - dbg.declare and dbg.value intrinsics are mutually exclusive; the presence of a dbg.declare will shadow any dbg.values that refer to the same variable. As soon as there is an entry in the MMI map (FunctionLowering::set() in FunctionLowering.cpp:134), the dbg.values will be ignored by DwarfDebug::collectVariableInfo (DwarfDebug.cpp::1366).
>>>>>> - Simply disabling that check will result in two separate DIEs for the same variable.
>>>>>>
>>>>>> This means that while my patch was correct based on how DwarfDebug currently behaves, that behavior is broken and I should instead allow for dbg.values and dbg.declares to be coalesced into a single location list.
>>>>>>
>>>>>> Before I dive into this I'd be curious to hear more about your ideas for location handling that you mentioned in your last mail, so we don't accidentally evolve the code into two different directions.
>>>>>>
>>>>>>
>>>>>> I also found a second, related issue that I'd like to tackle.
>>>>>>
>>>>>> This example:
>>>>>>
>>>>>>> foo(int map)
>>>>>>> {
>>>>>>> lookup(&map);
>>>>>>> if (!verify(map)) {  }
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> results in the following DWARF:
>>>>>>
>>>>>>> 0x00000027:     TAG_subprogram [2] *
>>>>>>>                AT_name( "foo" )
>>>>>>>                AT_low_pc( 0x0000000000000000 )
>>>>>>>                AT_high_pc( 0x0000000000000026 )
>>>>>>>             ...
>>>>>>> 0x00000046:         TAG_formal_parameter [3]
>>>>>>>                    AT_name( "map" )
>>>>>>>                    AT_type( {0x00000075} ( int ) )
>>>>>>>                    AT_location( fbreg -4 )
>>>>>>>
>>>>>>> 0x00000054:         TAG_lexical_block [4] *
>>>>>>>                    AT_low_pc( 0x0000000000000016 )
>>>>>>>                    AT_high_pc( 0x0000000000000020 )
>>>>>>>
>>>>>>> 0x00000065:             TAG_formal_parameter [3]
>>>>>>>                        AT_name( "map" )
>>>>>>>                        AT_type( {0x00000075} ( int ) )
>>>>>>>                        AT_location( fbreg -4 )
>>>>>>
>>>>>> I would like teach DWARFDebug::constructScopeDIE() to recognize if a variable DIE describes a "subset" of a DIE in one of its parent lexical blocks.
>>>>>>
>>>>>> To be clear, this just looks buggy - we shouldn't have two formal parameters here... there is only one formal parameter to 'foo'. If we create two DIEs that's a bug, we shouldn't do that.
>>>>>>
>>>>>
>>>>> That would be a fairly trivial fix in DWARFDebug::createScopeChildrenDIE(). I'm wondering: Should there ever be more than one DIE describing a variable in different lexical scopes or should there be only ever a single DIE with multiple entries in the DW_AT_location list?
>>>>>
>>>>> The latter, I'm fairly sure.
>>>>>
>>>>> Yes.
>>>>>
>>>>> -eric
>>>>
>>>
>>>
>