[PATCH] D32821: Add DWARF verifiers to verify address ranges are correct and scoped correctly.

Greg Clayton via llvm-commits llvm-commits at lists.llvm.org
Thu May 4 11:05:44 PDT 2017


> On May 4, 2017, at 10:15 AM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Thu, May 4, 2017 at 10:07 AM Adrian Prantl <aprantl at apple.com <mailto:aprantl at apple.com>> wrote:
>> On May 4, 2017, at 10:01 AM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>> 
>> 
>> 
>> On Thu, May 4, 2017 at 9:57 AM Adrian Prantl <aprantl at apple.com <mailto:aprantl at apple.com>> wrote:
>>> On May 4, 2017, at 9:46 AM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>> 
>>> 
>>> 
>>> On Thu, May 4, 2017 at 9:39 AM Greg Clayton <clayborg at gmail.com <mailto:clayborg at gmail.com>> wrote:
>>>> On May 3, 2017, at 7:38 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>>> 
>>>> For illustrative purposes I've attached a version (a patch on top of your patch) of some of what I had in mind - there's a few FIXMEs in there where I stumbled across untested functionality in your patch - so whichever patch/direction ends up being used, there's some missing test coverage to address.
>>>> 
>>>> One of the big things I'd misunderstood/hadn't expected was that the visit(DIE) function wasn't recursive. Though the approach I've taken could be implemented without the recursion (checking the depth of nodes and reducing the size of the vector as depths are traversed) I think it's probably a bit easier to read this way.
>>>> 
>>>> It seems to simplify things quite a bit, to my mind - and adds a few missing cases in multiple directions.
>>> 
>>> I like the recursion aspect.
>>> 
>>>> 
>>>> It looks like the original patch doesn't catch overlap between two lexical scopes in the same subprogram at the same level?
>>> 
>>> Yep it doesn't. I will add that.
>>> 
>>>> Also it looks like the overlapping function code only considers pairs ({1, 2}, {2, 3}, and so on) - so what about if {1, 3} overlap but neither overlap 2?
>>> 
>>> DieRangeInfos[0] = DW_TAG_compile_unit [0x0000000000001000 - 0x0000000000003000)
>>> DieRangeInfos[1] = DW_TAG_subprogram   [0x0000000000001000 - 0x0000000000002000)
>>> DieRangeInfos[2] = DW_TAG_lexical_block[0x0000000000001100 - 0x0000000000001200)
>>> DieRangeInfos[3] = DW_TAG_subprogram   [0x0000000000003000 - 0x0000000000004000)
>>> DieRangeInfos[4] = DW_TAG_lexical_block[0x0000000000003100 - 0x0000000000003200)
>>> 
>>> For the example above we check if any DW_TAG_subprogram is contained in DW_TAG_compile_unit if it has ranges, so it would check [0] contains [1], and [0] contains [2]. There would be no need to check if [1] or [2] contain [3] since it assumed to be its own function. Granted I haven't seen DWARF like this. But the stack still would easily verify all lexical blocks and inlined subroutines correctly , except the overlap between sibling blocks/inlined functions.
>>> 
>>> Each DW_TAG_subprogram would check any contained DW_TAG_lexical_block or DW_TAG_inlined_subroutine ([1] contains [2], [3] contains [4]). The patch assumes if for some reason a DW_TAG subprogram is contained within anything else that is is stand alone.
>>> 
>>> Adrian (Paul?) any thoughts on this? I know it's perhaps overly simplistic, but I like the idea of implementing the "if this DIE has ranges, those ranges must be a subrange of the nearest parent with ranges and not overlap with any other children of that parent" rule.
>>> 
>>> This is perhaps overly restrictive if you allow for a subprogram defined within another subprogram (where the inner one isn't strictly part of the address range of the outer?)
>> 
>> Wouldn't this break for Pascal-style inner functions, where the inner function's DW_TAG_subprogram is a child of the outer one's, but the address ranges are entirely non-overlapping?
>> 
>> That's the question - do you have an example of DWARF there?
> 
> This is legal Swift code, for example:
> 
> func outer(_ a: Int64) -> Int64 {
>      func inner(_ b: Int64) -> Int64 {
>        return a+b
>      }
>      return inner(42)
> }
> 
> and the DWARF output is:
> 
> 0x00000059:         TAG_subprogram [4] *
>                      AT_low_pc( 0x0000000000000010 )
>                      AT_high_pc( 0x00000029 )
>                      AT_frame_base( rbp )
>                      AT_linkage_name( "_TF1a5outerFVs5Int64S0_" )
>                      AT_name( "outer" )
>                      AT_decl_file( "/swift/test/DebugInfo/nested_functions.swift" )
>                      AT_decl_line( 5 )
>                      AT_type( {0x000000e0} ( Int64 ) )
>                      AT_external( true )
> 
> 0x00000076:             TAG_formal_parameter [5]  
>                          AT_location( fbreg -8 )
>                          AT_name( "a" )
>                          AT_decl_file( "/swift/test/DebugInfo/nested_functions.swift" )
>                          AT_decl_line( 5 )
>                          AT_type( {0x000000e0} ( Int64 ) )
> 
> 0x00000084:             TAG_subprogram [4] *
>                          AT_low_pc( 0x0000000000000040 )
>                          AT_high_pc( 0x00000023 )
>                          AT_frame_base( rbp )
>                          AT_linkage_name( "_TFF1a5outerFVs5Int64S0_L_5innerfS0_S0_" )
>                          AT_name( "inner" )
>                          AT_decl_file( "/swift/test/DebugInfo/nested_functions.swift" )
>                          AT_decl_line( 12 )
>                          AT_type( {0x000000e0} ( Int64 ) )
>                          AT_external( true )
> 
> 0x000000a1:                 TAG_formal_parameter [5]  
>                              AT_location( fbreg -8 )
>                              AT_name( "b" )
>                              AT_decl_file( "/swift/test/DebugInfo/nested_functions.swift" )
>                              AT_decl_line( 12 )
>                              AT_type( {0x000000e0} ( Int64 ) )
> ...
> 
> 
> Interesting choice given the history/precedent of GCC & Clang's output (& thus what GDB consumes) for nested functions like lambdas and blocks where the definition is certainly nested, but it's not described that way in DWARF, but certainly closer to the source.
> 
> Fair enough, then. Guess the general "all ranges are subsets of parent ranges" doesn't quite hold & needs a bunch of special handling for subprograms - looking more like Greg's original implementation.

I am making a hybrid that I think you'll like better that will work for all cases. I also am adding checking if two lexical blocks or inline subroutines overlap.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170504/8392a1eb/attachment.html>


More information about the llvm-commits mailing list