[PATCH] D32821: Add DWARF verifiers to verify address ranges are correct and scoped correctly.

Adrian Prantl via llvm-commits llvm-commits at lists.llvm.org
Thu May 4 09:57:11 PDT 2017


> On May 4, 2017, at 9:46 AM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Thu, May 4, 2017 at 9:39 AM Greg Clayton <clayborg at gmail.com <mailto:clayborg at gmail.com>> wrote:
>> On May 3, 2017, at 7:38 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>> 
>> For illustrative purposes I've attached a version (a patch on top of your patch) of some of what I had in mind - there's a few FIXMEs in there where I stumbled across untested functionality in your patch - so whichever patch/direction ends up being used, there's some missing test coverage to address.
>> 
>> One of the big things I'd misunderstood/hadn't expected was that the visit(DIE) function wasn't recursive. Though the approach I've taken could be implemented without the recursion (checking the depth of nodes and reducing the size of the vector as depths are traversed) I think it's probably a bit easier to read this way.
>> 
>> It seems to simplify things quite a bit, to my mind - and adds a few missing cases in multiple directions.
> 
> I like the recursion aspect.
> 
>> 
>> It looks like the original patch doesn't catch overlap between two lexical scopes in the same subprogram at the same level?
> 
> Yep it doesn't. I will add that.
> 
>> Also it looks like the overlapping function code only considers pairs ({1, 2}, {2, 3}, and so on) - so what about if {1, 3} overlap but neither overlap 2?
> 
> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 - 0x0000000000002000)
> DieRangeInfos[2] = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
> DieRangeInfos[3] = DW_TAG_subprogram   [0x0000000000003000 - 0x0000000000004000)
> DieRangeInfos[4] = DW_TAG_lexical_block[0x0000000000003100 - 0x0000000000003200)
> 
> For the example above we check if any DW_TAG_subprogram is contained in DW_TAG_compile_unit if it has ranges, so it would check [0] contains [1], and [0] contains [2]. There would be no need to check if [1] or [2] contain [3] since it assumed to be its own function. Granted I haven't seen DWARF like this. But the stack still would easily verify all lexical blocks and inlined subroutines correctly , except the overlap between sibling blocks/inlined functions.
> 
> Each DW_TAG_subprogram would check any contained DW_TAG_lexical_block or DW_TAG_inlined_subroutine ([1] contains [2], [3] contains [4]). The patch assumes if for some reason a DW_TAG subprogram is contained within anything else that is is stand alone.
> 
> Adrian (Paul?) any thoughts on this? I know it's perhaps overly simplistic, but I like the idea of implementing the "if this DIE has ranges, those ranges must be a subrange of the nearest parent with ranges and not overlap with any other children of that parent" rule.
> 
> This is perhaps overly restrictive if you allow for a subprogram defined within another subprogram (where the inner one isn't strictly part of the address range of the outer?)

Wouldn't this break for Pascal-style inner functions, where the inner function's DW_TAG_subprogram is a child of the outer one's, but the address ranges are entirely non-overlapping?

-- adrian

> and overly permissive in that it would allow a lexical_block without ranges to be the parent to a lexical_block with ranges. (though perhaps it's better to catch that with a separate pass about which blocks should and shouldn't have ranges)
>  
> I'll come up with a hybrid of the two and repost.
> 
> 
>> (& I also left some code that would catch the case where ranges within a single range list would overlap - perhaps not wrong, but seems a bit broken?)
> 
>> 
>> On Wed, May 3, 2017 at 2:17 PM Greg Clayton <clayborg at gmail.com <mailto:clayborg at gmail.com>> wrote:
>> There should not have been a DW_AT_low_pc on the class type...
>> 
>>> On May 3, 2017, at 2:15 PM, Greg Clayton <clayborg at gmail.com <mailto:clayborg at gmail.com>> wrote:
>>> 
>>>> 
>>>> On May 3, 2017, at 1:47 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Wed, May 3, 2017 at 1:43 PM Greg Clayton via Phabricator <reviews at reviews.llvm.org <mailto:reviews at reviews.llvm.org>> wrote:
>>>> clayborg added a comment.
>>>> 
>>>> One thing to clarify:
>>>> 
>>>>   std::vector<RangeInfoType> DieRangeInfos;
>>>> 
>>>> is a vector that only gets as deep as the DIE depth. It doesn't get every range for every function, block, or inlined function, it just has a stack of ranges for the _current_ DIE depth.
>>>> 
>>>> Which is part of what I'm confused by - I wouldn't expect to see/don't understand why there's any code that special-cases subprograms here. They don't seem special to me.
>>> 
>>> They are special because they are top level functions. DW_TAG_subprogram appear at any depth and their parent isn't always a DW_TAG_compile_unit. So we might have DWARF like:
>>> 
>>> 0x0000000b: DW_TAG_compile_unit [1] *
>>>              DW_AT_low_pc( 0x0000000000001000 )
>>>              DW_AT_high_pc( 0x0000000000003000 )
>>>              DW_AT_name( "/tmp/main.c" )
>>> 
>>> 0x00000020:     DW_TAG_subprogram [2] *
>>>                  DW_AT_name( "main" )
>>>                  DW_AT_low_pc( 0x0000000000001000 )
>>>                  DW_AT_high_pc( 0x0000000000002000 )
>>> 
>>> 0x00000035:         DW_TAG_lexical_block [3]  
>>>                      DW_AT_low_pc( 0x0000000000001100 )
>>>                      DW_AT_high_pc( 0x0000000000001200 )
>>> 
>>> 0x00000046:         NULL
>>> 
>>> 0x00000047:     DW_TAG_class_type [4] *
>>>                  DW_AT_low_pc( 0x0000000000001000 )
>>>                  DW_AT_name( "Foo" )
>>> 
>>> 0x00000054:         DW_TAG_subprogram [2] *
>>>                      DW_AT_name( "Bar" )
>>>                      DW_AT_low_pc( 0x0000000000002000 )
>>>                      DW_AT_high_pc( 0x0000000000003000 )
>>> 
>>> 0x00000069:             DW_TAG_lexical_block [3]  
>>>                          DW_AT_low_pc( 0x0000000000002100 )
>>>                          DW_AT_high_pc( 0x0000000000002200 )
>>> 
>>> 0x0000007a:             NULL
>>> 
>>> 0x0000007b:         NULL
>>> 
>>> 0x0000007c:     NULL
>>> 
>>> When visiting the DW_TAG_subprogram at 0x00000020 we would do:
>>> 
>>> Depth = Die.getDepth(); // depth will be 1
>>> if (!DieRangeInfos[0].Ranges.empty())
>>>   assert(DieRangeInfos[0].Contains(DieRangeInfos[Depth].Ranges)
>>> 
>>> since 0x00000020 is at depth 1
>>> 
>>> Our DieRangeInfos stack would be:
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 - 0x0000000000002000)
>>> 
>>> When visiting the DW_TAG_lexical_block at 0x00000035 we would only check:
>>> 
>>> assert(DieRangeInfos[Depth-1].Contains(DieRangeInfos[Depth].Ranges))
>>> 
>>> Our DieRangeInfos stack would be:
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 - 0x0000000000002000)
>>> DieRangeInfos[2] = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
>>> 
>>> When visiting 0x00000054 we would again check against the compile units ranges:
>>> 
>>> Depth = Die.getDepth(); // depth will be 2
>>> if (!DieRangeInfos[0].Ranges.empty())
>>>   assert(DieRangeInfos[0].Contains(DieRangeInfos[Depth].Ranges)
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_class_type
>>> DieRangeInfos[2] = DW_TAG_subprogram   [0x0000000000002000 - 0x0000000000003000)
>>> 
>>> Then visiting 0x00000069 we do:
>>> 
>>> assert(DieRangeInfos[Depth-1].Contains(DieRangeInfos[Depth].Ranges))
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_class_type
>>> DieRangeInfos[2] = DW_TAG_subprogram   [0x0000000000002000 - 0x0000000000003000)
>>> DieRangeInfos[3] = DW_TAG_lexical_block[0x0000000000002100 - 0x0000000000002200)
>>> 
>>> 
>>>> I'd expect an algorithm that starts at the CU DIE and looks for any DIE that has ranges. It wouldn't need special cases for any tag types.
>>> 
>>> You only need to verify your range against something your parent. For some, like a DW_TAG_subprogram, it must be in the DW_TAG_compile_unit only if it has ranges. For DW_TAG_lexical_block and DW_TAG_inlined_subroutine, they MUST be in their parents if their ranges are valid.
>> 
>> Fair point - I was trying to think of if there were any particular scopes inside a function that could not have ranges, guess not. I could see adding that as a direct case in my example ("if tag type is <CU/lexical_block/inlined_subroutine> it must have ranges" - actually, though, now I reflect on that, that isn't true - abstract origins will have lexical blocks but they won't have any ranges on them) 
>>> 
>>> It is also possible in other languages to have functions declared inside of functions.
>> 
>> 
>> Do you have some concrete examples of this? Would be good to see wat sort of DWARF existing providers/consumers are using here.
>>  
>>> In that case you want to only verify against your parent and stop at the DW_TAG_subprogram level. You could have a stack like:
>>> 
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 - 0x0000000000002000)
>>> DieRangeInfos[2] = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
>>> DieRangeInfos[3] = DW_TAG_subprogram   [0x0000000000003000 - 0x0000000000004000)
>>> DieRangeInfos[4] = DW_TAG_lexical_block[0x0000000000003100 - 0x0000000000003200)
>>> 
>>> You would only want to see if:
>>> - [4] is in [3]
>>> - [2] is in [1]
>>> - [3] and [1] are in [0]
>>> 
>>> Make sense?
>> 
>> 
>> At least the DWARF I've seen for cases like this would have the function declared locally inside the outer function, but the DWARF for the definition would be separate/outside. I'd be curious to see a producer/consumer using this nested idiom. Do you have an example?
>>  
> 
>> <die_verification.diff>
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170504/bb8134c4/attachment.html>


More information about the llvm-commits mailing list