[PATCH] D32821: Add DWARF verifiers to verify address ranges are correct and scoped correctly.

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Thu May 4 09:46:54 PDT 2017


On Thu, May 4, 2017 at 9:39 AM Greg Clayton <clayborg at gmail.com> wrote:

> On May 3, 2017, at 7:38 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> For illustrative purposes I've attached a version (a patch on top of your
> patch) of some of what I had in mind - there's a few FIXMEs in there where
> I stumbled across untested functionality in your patch - so whichever
> patch/direction ends up being used, there's some missing test coverage to
> address.
>
> One of the big things I'd misunderstood/hadn't expected was that the
> visit(DIE) function wasn't recursive. Though the approach I've taken could
> be implemented without the recursion (checking the depth of nodes and
> reducing the size of the vector as depths are traversed) I think it's
> probably a bit easier to read this way.
>
> It seems to simplify things quite a bit, to my mind - and adds a few
> missing cases in multiple directions.
>
>
> I like the recursion aspect.
>
>
> It looks like the original patch doesn't catch overlap between two lexical
> scopes in the same subprogram at the same level?
>
>
> Yep it doesn't. I will add that.
>
> Also it looks like the overlapping function code only considers pairs ({1,
> 2}, {2, 3}, and so on) - so what about if {1, 3} overlap but neither
> overlap 2?
>
>
> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
> 0x0000000000003000)
> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 -
> 0x0000000000002000)
> DieRangeInfos[2] = DW_TAG_lexical_block[0x0000000000001100 -
> 0x0000000000001200)
> DieRangeInfos[3] = DW_TAG_subprogram   [0x0000000000003000 -
> 0x0000000000004000)
> DieRangeInfos[4] = DW_TAG_lexical_block[0x0000000000003100 -
> 0x0000000000003200)
>
> For the example above we check if any DW_TAG_subprogram is contained
> in DW_TAG_compile_unit if it has ranges, so it would check [0] contains
> [1], and [0] contains [2]. There would be no need to check if [1] or [2]
> contain [3] since it assumed to be its own function. Granted I haven't seen
> DWARF like this. But the stack still would easily verify all lexical blocks
> and inlined subroutines correctly , except the overlap between sibling
> blocks/inlined functions.
>
> Each DW_TAG_subprogram would check any contained DW_TAG_lexical_block or
> DW_TAG_inlined_subroutine ([1] contains [2], [3] contains [4]). The patch
> assumes if for some reason a DW_TAG subprogram is contained within anything
> else that is is stand alone.
>

Adrian (Paul?) any thoughts on this? I know it's perhaps overly simplistic,
but I like the idea of implementing the "if this DIE has ranges, those
ranges must be a subrange of the nearest parent with ranges and not overlap
with any other children of that parent" rule.

This is perhaps overly restrictive if you allow for a subprogram defined
within another subprogram (where the inner one isn't strictly part of the
address range of the outer?) and overly permissive in that it would allow a
lexical_block without ranges to be the parent to a lexical_block with
ranges. (though perhaps it's better to catch that with a separate pass
about which blocks should and shouldn't have ranges)


> I'll come up with a hybrid of the two and repost.
>
>
> (& I also left some code that would catch the case where ranges within a
> single range list would overlap - perhaps not wrong, but seems a bit
> broken?)
>
>
>
> On Wed, May 3, 2017 at 2:17 PM Greg Clayton <clayborg at gmail.com> wrote:
>
>> There should not have been a DW_AT_low_pc on the class type...
>>
>> On May 3, 2017, at 2:15 PM, Greg Clayton <clayborg at gmail.com> wrote:
>>
>>
>> On May 3, 2017, at 1:47 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Wed, May 3, 2017 at 1:43 PM Greg Clayton via Phabricator <
>> reviews at reviews.llvm.org> wrote:
>>
>>> clayborg added a comment.
>>>
>>> One thing to clarify:
>>>
>>>   std::vector<RangeInfoType> DieRangeInfos;
>>>
>>> is a vector that only gets as deep as the DIE depth. It doesn't get
>>> every range for every function, block, or inlined function, it just has a
>>> stack of ranges for the _current_ DIE depth.
>>>
>>
>> Which is part of what I'm confused by - I wouldn't expect to see/don't
>> understand why there's any code that special-cases subprograms here. They
>> don't seem special to me.
>>
>>
>> They are special because they are top level functions. DW_TAG_subprogram
>> appear at any depth and their parent isn't always a DW_TAG_compile_unit. So
>> we might have DWARF like:
>>
>> 0x0000000b: DW_TAG_compile_unit [1] *
>>              DW_AT_low_pc( 0x0000000000001000 )
>>              DW_AT_high_pc( 0x0000000000003000 )
>>              DW_AT_name( "/tmp/main.c" )
>>
>> 0x00000020:     DW_TAG_subprogram [2] *
>>                  DW_AT_name( "main" )
>>                  DW_AT_low_pc( 0x0000000000001000 )
>>                  DW_AT_high_pc( 0x0000000000002000 )
>>
>> 0x00000035:         DW_TAG_lexical_block [3]
>>                      DW_AT_low_pc( 0x0000000000001100 )
>>                      DW_AT_high_pc( 0x0000000000001200 )
>>
>> 0x00000046:         NULL
>>
>> 0x00000047:     DW_TAG_class_type [4] *
>>                  DW_AT_low_pc( 0x0000000000001000 )
>>                  DW_AT_name( "Foo" )
>>
>> 0x00000054:         DW_TAG_subprogram [2] *
>>                      DW_AT_name( "Bar" )
>>                      DW_AT_low_pc( 0x0000000000002000 )
>>                      DW_AT_high_pc( 0x0000000000003000 )
>>
>> 0x00000069:             DW_TAG_lexical_block [3]
>>                          DW_AT_low_pc( 0x0000000000002100 )
>>                          DW_AT_high_pc( 0x0000000000002200 )
>>
>> 0x0000007a:             NULL
>>
>> 0x0000007b:         NULL
>>
>> 0x0000007c:     NULL
>>
>> When visiting the DW_TAG_subprogram at 0x00000020 we would do:
>>
>> Depth = Die.getDepth(); // depth will be 1
>> if (!DieRangeInfos[0].Ranges.empty())
>>   assert(DieRangeInfos[0].Contains(DieRangeInfos[Depth].Ranges)
>>
>> since 0x00000020 is at depth 1
>>
>> Our DieRangeInfos stack would be:
>>
>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
>> 0x0000000000003000)
>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 -
>> 0x0000000000002000)
>>
>> When visiting the DW_TAG_lexical_block at 0x00000035 we would only check:
>>
>> assert(DieRangeInfos[Depth-1].Contains(DieRangeInfos[Depth].Ranges))
>>
>> Our DieRangeInfos stack would be:
>>
>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
>> 0x0000000000003000)
>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 -
>> 0x0000000000002000)
>> DieRangeInfos[2]
>> = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
>>
>> When visiting 0x00000054 we would again check against the compile units
>> ranges:
>>
>> Depth = Die.getDepth(); // depth will be 2
>> if (!DieRangeInfos[0].Ranges.empty())
>>   assert(DieRangeInfos[0].Contains(DieRangeInfos[Depth].Ranges)
>>
>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
>> 0x0000000000003000)
>> DieRangeInfos[1] = DW_TAG_class_type
>> DieRangeInfos[2] = DW_TAG_subprogram
>> [0x0000000000002000 - 0x0000000000003000)
>>
>> Then visiting 0x00000069 we do:
>>
>> assert(DieRangeInfos[Depth-1].Contains(DieRangeInfos[Depth].Ranges))
>>
>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
>> 0x0000000000003000)
>> DieRangeInfos[1] = DW_TAG_class_type
>> DieRangeInfos[2] = DW_TAG_subprogram
>> [0x0000000000002000 - 0x0000000000003000)
>> DieRangeInfos[3]
>> = DW_TAG_lexical_block[0x0000000000002100 - 0x0000000000002200)
>>
>>
>> I'd expect an algorithm that starts at the CU DIE and looks for any DIE
>> that has ranges. It wouldn't need special cases for any tag types.
>>
>>
>> You only need to verify your range against something your parent. For
>> some, like a DW_TAG_subprogram, it must be in the DW_TAG_compile_unit only
>> if it has ranges. For DW_TAG_lexical_block and DW_TAG_inlined_subroutine,
>> they MUST be in their parents if their ranges are valid.
>>
>> Fair point - I was trying to think of if there were any particular scopes
> inside a function that could not have ranges, guess not. I could see adding
> that as a direct case in my example ("if tag type is
> <CU/lexical_block/inlined_subroutine> it must have ranges" - actually,
> though, now I reflect on that, that isn't true - abstract origins will have
> lexical blocks but they won't have any ranges on them)
>
>>
>> It is also possible in other languages to have functions declared inside
>> of functions.
>>
>>
> Do you have some concrete examples of this? Would be good to see wat sort
> of DWARF existing providers/consumers are using here.
>
>
>> In that case you want to only verify against your parent and stop at the
>> DW_TAG_subprogram level. You could have a stack like:
>>
>>
>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 -
>> 0x0000000000003000)
>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 -
>> 0x0000000000002000)
>> DieRangeInfos[2]
>> = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
>> DieRangeInfos[3] = DW_TAG_subprogram   [0x0000000000003000 -
>> 0x0000000000004000)
>> DieRangeInfos[4]
>> = DW_TAG_lexical_block[0x0000000000003100 - 0x0000000000003200)
>>
>> You would only want to see if:
>> - [4] is in [3]
>> - [2] is in [1]
>> - [3] and [1] are in [0]
>>
>> Make sense?
>>
>>
> At least the DWARF I've seen for cases like this would have the function
> declared locally inside the outer function, but the DWARF for the
> definition would be separate/outside. I'd be curious to see a
> producer/consumer using this nested idiom. Do you have an example?
>
>
> <die_verification.diff>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170504/5553e22c/attachment.html>


More information about the llvm-commits mailing list