[PATCH] D28040: Don't store the NULL DIEs in the DIE array in the DWARFUnit.

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 22 09:59:45 PST 2016


On Thu, Dec 22, 2016 at 9:04 AM Greg Clayton via Phabricator <
reviews at reviews.llvm.org> wrote:

> clayborg added a comment.
>
> Added inline comments and responses. New patch coming with fixes.
>
>
>
> ================
> Comment at: include/llvm/DebugInfo/DWARF/DWARFDebugInfoEntry.h:39
> +  /// NULL DIEs are now removed when storing DWARFDebugInfoEntry
> +  uint32_t HasNullSibling : 1;
>
> ----------------
> dblaikie wrote:
> > This should be a bool.
> If you change the type it won't be part of the uint32_t for
> HasNullSibling. I compiled a quick example:
>
> ```
> #include <stdint.h>
>
> struct Test {
>   uint32_t Depth : 31;
>   bool IsLastChild : 1;
> };
>
> int main (int argc, char const *argv[])
> {
>   Test t = { 123, false };
>   if (t.Depth)
>     return 0;
>   return 1;
> }
> ```
>
> And I got the following DWARF:
> ```
> 0x000000b0:     TAG_structure_type [8] *
>                  AT_name( "Test" )
>                  AT_byte_size( 0x04 )
>                  AT_decl_file( "/private/tmp/main.c" )
>                  AT_decl_line( 3 )
>
> 0x000000b8:         TAG_member [9]
>                      AT_name( "Depth" )
>                      AT_type( {0x00000000000000e3} ( uint32_t ) )
>                      AT_decl_file( "/private/tmp/main.c" )
>                      AT_decl_line( 4 )
>                      AT_byte_size( 0x04 )
>                      AT_bit_size( 0x1f )
>                      AT_bit_offset( 0x01 )
>                      AT_data_member_location( +0 )
>
> 0x000000cd:         TAG_member [9]
>                      AT_name( "IsLastChild" )
>                      AT_type( {0x00000000000000f9} ( bool ) )
>                      AT_decl_file( "/private/tmp/main.c" )
>                      AT_decl_line( 5 )
>                      AT_byte_size( 0x01 )
>                      AT_bit_size( 0x01 )
>                      AT_bit_offset( 0x00 )
>                      AT_data_member_location( +3 )
>
> 0x000000e2:         NULL
> ```
> Note that IsLastChild has a AT_data_member_location that is +3 so it is in
> its own block. The type has to be the same in order to share storage.


I think you might be misreading the DWARF (& it looks like the DWARF is a
bit confusing, as well).

For starters - in both your examples, the byte_size of the struct is 4
bytes. So it seems storage is shared, or at least that using the same type
doesn't improve/reduce the size of the struct - so on that basis I'd
probably suggest using the bool type.

Beyond that, though. The AT_data_member_location is +3, as you say -
not +4, so it is still using the top (or bottom - the counting here is a
bit weird imho, but seems to work out) byte of the previous 4 byte value.

Also - we have /lots/ of similar mixed type bitfields in llvm, I think
(have a grep around - it looks like it to me, at least) - so if this isn't
doing the right thing I'd be /really/ surprised.


> Changing IsLastChild to a uint32_t we see things are right:
>
> ```
> 0x000000b0:     TAG_structure_type [8] *
>                  AT_name( "Test" )
>                  AT_byte_size( 0x04 )
>                  AT_decl_file( "/private/tmp/main.c" )
>                  AT_decl_line( 3 )
>
> 0x000000b8:         TAG_member [9]
>                      AT_name( "Depth" )
>                      AT_type( {0x00000000000000e3} ( uint32_t ) )
>                      AT_decl_file( "/private/tmp/main.c" )
>                      AT_decl_line( 4 )
>                      AT_byte_size( 0x04 )
>                      AT_bit_size( 0x1f )
>                      AT_bit_offset( 0x01 )
>                      AT_data_member_location( +0 )
>
> 0x000000cd:         TAG_member [9]
>                      AT_name( "IsLastChild" )
>                      AT_type( {0x00000000000000e3} ( uint32_t ) )
>                      AT_decl_file( "/private/tmp/main.c" )
>                      AT_decl_line( 5 )
>                      AT_byte_size( 0x04 )
>                      AT_bit_size( 0x01 )
>                      AT_bit_offset( 0x00 )
>                      AT_data_member_location( +0 )
>
> 0x000000e2:         NULL
> ```
>
>
> ================
> Comment at: lib/DebugInfo/DWARF/DWARFDie.cpp:359
>    if (!isValid())
> -    return;
> +    return -1U;
>    DataExtractor debug_info_data = U->getDebugInfoExtractor();
> ----------------
> aprantl wrote:
> > Is there a better alternative to returning a magic value?
> We should probably just assert this object is valid. Clients should check
> prior to calling.
>
>
> ================
> Comment at: lib/DebugInfo/DWARF/DWARFDie.cpp:398-402
> +          WithColor(OS, syntax::Address).get() << format("\n0x%8.8x: ",
> Offset);
> +          OS.indent(Indent) << "NULL\n";
> +          // Add 1 to the offset for the ULEB128 abbreviation code byte.
> +          Offset += 1;
> +        }
> ----------------
> I believe this does belong here. A DIE could claim to have children but
> only have a NULL inside of it, then we would fail to print the NULL if we
> try to do it with children at line 392.


How would we succeed to print the NULL if we aren't printing children?
There would be no 'last child' & so we wouldn't print the NULL in that case
either.

You're right though, we wouldn't print it in 392, but perhaps we should
change the condition here from "if (Die->isLastChild())" to "if
(Die->hasChildren())" ?

Also, a test case for an empty child list would be good.


> Also, if RecurseDepth is zero, we still need to print the NULL die since
> it is at the current recurse level. Given these reasons, it seems that we
> still need the getHasNULLSibling() property and this code belongs where it
> currently is.
>
>
> ================
> Comment at: tools/dsymutil/DwarfLinker.cpp:1799
>    if (DIE.hasChildren())
> -    for (auto Child = DIE.getFirstChild(); Child && !Child.isNULL();
> -         Child = Child.getSibling())
> +    for (auto Child = DIE.getFirstChild(); Child; Child =
> Child.getSibling())
>        Info.Prune &= analyzeContextInfo(Child, MyIdx, CU,
> CurrentDeclContext,
> ----------------
> aprantl wrote:
> > Might make sense to add a method that returns something like a
> SiblingIterator at some point in the future?
> I agree. A later commit we can add a function like:
>
> ```
>   DieIteratorRange DWARFDie::Children();
> ```
>
> So we can then so:
>
> ```
> for (auto Child: DIE.Children()) {
> ```
>
>
>
> ================
> Comment at: unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:1043
>    EXPECT_EQ(B.getParent(), CUDie);
> -  EXPECT_EQ(Null.getParent(), CUDie);
> +  EXPECT_EQ(C.getParent(), CUDie);
>
> ----------------
> I argue this one is needed. A was the first child of CU, B was the second
> child after A had no children, and C is the third child that follows B
> which had children. I do think all are valid tests that we can't regress on.


Just to be clear - I'm never suggesting there are test cases we "can
regress on", just asking about whether the test cases pull their weight -
whether they exercise sufficiently distinct/interesting codepaths that make
them important tests compared to the existing coverage.


> This is testing the DWARF parser and making sure it they can track down
> their relations correctly. If we significantly change DWARFDebugInfoEntry
> at any point, we should verify everything still works.
>
> Knowing the current algorithm to find the parent when we find the parent
> for A the depth decreases only as you walk backwards. The depth for B does,
> but it also has A at the same depth. The depth for C will go up when it
> sees the children for B and then go back down again, so I do believe all
> these tests are valid.
>

It looks like C is perhaps a superset - it demonstrates we can go down into
children, and back up. But, yes, good point that A and B are
boundary/corner cases that are probably reasonable to test for separately.

Please add comments describing what makes each of these paths interesting.
(eg: "// verify an immediate parent", "// verify skipping preceding
siblings", "// verify skipping preceding niblings" (luls:
https://en.wiktionary.org/wiki/nibling ) - those probably aren't the
perfect comments, but a rough idea)


>
>
> ================
> Comment at: unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:1050
>    auto B2 = B1.getSibling();
> -  EXPECT_TRUE(B2.getSibling().isNULL());
> +  EXPECT_FALSE(B2.getSibling().isValid());
>
> ----------------
> aprantl wrote:
> > Should we introduce something like a hasSibling() method? This almost
> sounds like it has a corrupted sibling DIE.
> Now when we ask for a sibling and it doesn't have one we get an invalid
> DWARFDie back so this is the new way to check for that. I am not sure
> adding a hasSibling() method will make the code any clearer? If you wrote a
> loop using the current code, it currently looks like:
>
> ```
>   for (auto Child = Die.getFirstChild(); Child; Child = Child.getSibling())
> ```
>
> We the "Child;" test is using the operator bool. If we add the
> hasSibling() would code actually use it? Does this look clearer:
>
> ```
>   for (auto Child = Die.getFirstChild(); Child; Child = Child.hasSibling()
> ? Child.getSibling() : DWARFDie())
> ```
>
> So if we introduced it, who would actually use it? We could switch the
> code to use operator bool:
>
> ```
>   EXPECT_FALSE((bool)B2.getSibling());
> ```
>
> But I am not sure that makes things clearer.
>

*nod* don't think it's a big deal right now, and I think I'm OK (but see
what Adrian has to say) with this null DIE being a valid return value.

It'll be all easier once iterators/ranges are added (but, no, I wouldn't
add them in this commit).


>
>
> https://reviews.llvm.org/D28040
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161222/dff56938/attachment-0001.html>


More information about the llvm-commits mailing list