[lldb-dev] LTO debug info

Philippe Lavoie via lldb-dev lldb-dev at lists.llvm.org
Wed May 11 13:27:10 PDT 2016


Thanks for the explanation and the patch.

I tried it and the exception is gone.
The debug info also seems complete. As complete as can be for an LTO build...

I attached a modified patch with the calls to ExtractDIEsIfNeeded() on separate threads.

Should I submit it for review ?

-Philippe

________________________________
From: Greg Clayton [gclayton at apple.com]
Sent: Tuesday, May 03, 2016 5:02 PM
To: Philippe Lavoie
Cc: lldb-dev at lists.llvm.org
Subject: Re: [lldb-dev] LTO debug info


> On May 3, 2016, at 6:35 AM, Philippe Lavoie <philippe.lavoie at octasic.com> wrote:
>
> The code accessing another CU while indexing is in DWARFCompileUnit::GetDIE.
> If the DIE is not in the current CU, it finds the CU that has it and calls its ::GetDIE method, accessing its m_die_array data...
>
> So the question is: is it a producer (CU should be self-contained) or a consumer (the indexing) bug ?

The indexing needs to be fixed. We probably need to start checking the DW_FORM of an attribute before we try to get the DIE for it. So the problem seems to be in DWARFCompileUnit::IndexPrivate() where when the tag is a DW_TAG_subprogram it eventually does:

if (specification_die_form.IsValid())
{
    DWARFDIE specification_die = dwarf_cu->GetSymbolFileDWARF()->DebugInfo()->GetDIE (DIERef(specification_die_form));
    if (specification_die.GetParent().IsStructOrClass())
        is_method = true;
}

This is the only call to GetDIE in the entire function. It is getting the DIE so it can look to see who the parent it to see if the parent of the function is a struct/class, which means it is a method, or anything else and the function is just the basename of a raw function. So it seems like this should be OK to access other DIEs as it isn't adding any information from the other DWARF file, it is just using it to check who the parent of function (DW_TAG_subprogram) is.

So the real problem is that "ClearDIEs()" is being called too soon before everyone is done using the DIEs. The reason this was being done is that you might have 10000 compile units and we want to partially parse the the DWARF and only use what we need, but if we index everything we want to load the DIEs for a compile unit and clear them after indexing if they weren't loaded before we did the index so we don't take up loads of memory.

So what we need to do to fix this is:

1 - Run through all of the compile units first and see if each CU has the DIEs parsed and remember this
2 - Index all compile units
3 - Clear any DIEs in any compile units that didn't have their DIEs parsed _after_ all compile units have been indexed

A proposed patch is attached:



the main issue is now all DWARF is parsed on the current thread. So this patch should be updated to run all of the ExtractDIEsIfNeeded() functions on separate threads since this can be time consuming:

        //----------------------------------------------------------------------
        // First figure out which compile units didn't have their DIEs already
        // parsed and remember this.  If no DIEs were parsed prior to this index
        // function call, we are going to want to clear the CU dies after we
        // are done indexing to make sure we don't pull in all DWARF dies, but
        // we need to wait until all compile units have been indexed in case
        // a DIE in one compile unit refers to another and the indexes accesses
        // those DIEs.
        //----------------------------------------------------------------------
        for (uint32_t cu_idx = 0; cu_idx < num_compile_units; ++cu_idx)
        {
            DWARFCompileUnit* dwarf_cu = debug_info->GetCompileUnitAtIndex(cu_idx);
            if (dwarf_cu)
            {
                // dwarf_cu->ExtractDIEsIfNeeded(false) will return zero if the
                // DIEs for a compile unit have already been parsed.
                clear_cu_dies[cu_idx] = dwarf_cu->ExtractDIEsIfNeeded(false) > 1;
            }
        }

>
> //----------------------------------------------------------------------
> // GetDIE()
> //
> // Get the DIE (Debug Information Entry) with the specified offset by
> // first checking if the DIE is contained within this compile unit and
> // grabbing the DIE from this compile unit. Otherwise we grab the DIE
> // from the DWARF file.
> //----------------------------------------------------------------------
> DWARFDIE
> DWARFCompileUnit::GetDIE (dw_offset_t die_offset)
> {
>    if (die_offset != DW_INVALID_OFFSET)
>    {
>        if (m_dwo_symbol_file)
>            return m_dwo_symbol_file->GetCompileUnit()->GetDIE(die_offset);
>
>        if (ContainsDIEOffset(die_offset))
>        {
>            ExtractDIEsIfNeeded (false);
>            DWARFDebugInfoEntry::iterator end = m_die_array.end();
>            DWARFDebugInfoEntry::iterator pos = lower_bound(m_die_array.begin(), end, die_offset, CompareDIEOffset);
>            if (pos != end)
>            {
>                if (die_offset == (*pos).GetOffset())
>                    return DWARFDIE(this, &(*pos));
>            }
>        }
>        else
>        {
>            // Don't specify the compile unit offset as we don't know it because the DIE belongs to
>            // a different compile unit in the same symbol file.
>            return m_dwarf2Data->DebugInfo()->GetDIEForDIEOffset(die_offset);
>        }
>    }
>    return DWARFDIE(); // Not found
> }
>
> ________________________________________
> From: lldb-dev [lldb-dev-bounces at lists.llvm.org] on behalf of Philippe Lavoie via lldb-dev [lldb-dev at lists.llvm.org]
> Sent: Friday, April 29, 2016 3:27 PM
> To: Greg Clayton
> Cc: lldb-dev at lists.llvm.org
> Subject: Re: [lldb-dev] LTO debug info
>
>> So the main question is why is anything that is indexing the current CU only, accessing anything from another CU? It can't and shouldn't. That is the bug.
>
> Indexing is accessing another CU because there is a reference (offset) to another CU.
>
> For example, in the 2nd compile unit below, the DW_AT_type of DW_TAG_formal_parameter at offset 0x1772 is referencing the 1st compile unit (offset 0x1741).
>
> Compilation Unit @ offset 0x16f9:
>   Length:        0x4c (32-bit)
>   Version:       4
>   Abbrev Offset: 0x763
>   Pointer Size:  4
> <0><1704>: Abbrev Number: 1 (DW_TAG_compile_unit)
>    <1705>   DW_AT_producer    : (indirect string, offset: 0x0): clang version 3.9.0 (trunk)
>    <1709>   DW_AT_language    : 12     (ANSI C99)
>    <170b>   DW_AT_name        : (indirect string, offset: 0x4be): main.c
>    <170f>   DW_AT_stmt_list   : 0x1329
>    <1713>   DW_AT_comp_dir    : (indirect string, offset: 0x4c5): C:\Users\phlav\Documents\opus studio 2013\Projects\OpusProject9
>    <1717>   Unknown AT value: 3fe1: 1
>    <1717>   DW_AT_low_pc      : 0x400000
>    <171b>   DW_AT_high_pc     : 0x8
> <1><171f>: Abbrev Number: 2 (DW_TAG_subprogram)
>    <1720>   DW_AT_low_pc      : 0x400000
>    <1724>   DW_AT_high_pc     : 0x8
>    <1728>   Unknown AT value: 3fe7: 1
>    <1728>   DW_AT_frame_base  : 1 byte block: 51       (DW_OP_reg1 (r1))
>    <172a>   DW_AT_name        : (indirect string, offset: 0xab): main
>    <172e>   DW_AT_decl_file   : 1
>    <172f>   DW_AT_decl_line   : 4
>    <1730>   DW_AT_type        : <0x1741>
>    <1734>   DW_AT_external    : 1
>    <1734>   Unknown AT value: 3fe1: 1
> <2><1734>: Abbrev Number: 3 (DW_TAG_variable)
>    <1735>   DW_AT_const_value : 55
>    <1736>   DW_AT_name        : (indirect string, offset: 0xd9): x
>    <173a>   DW_AT_decl_file   : 1
>    <173b>   DW_AT_decl_line   : 7
>    <173c>   DW_AT_type        : <0x1741>
> <1><1741>: Abbrev Number: 4 (DW_TAG_base_type)
>    <1742>   DW_AT_name        : (indirect string, offset: 0x8f): int
>    <1746>   DW_AT_encoding    : 5      (signed)
>    <1747>   DW_AT_byte_size   : 4
>
>  Compilation Unit @ offset 0x1749:
>   Length:        0x32 (32-bit)
>   Version:       4
>   Abbrev Offset: 0x763
>   Pointer Size:  4
> <0><1754>: Abbrev Number: 5 (DW_TAG_compile_unit)
>    <1755>   DW_AT_producer    : (indirect string, offset: 0x0): clang version 3.9.0 (trunk)
>    <1759>   DW_AT_language    : 12     (ANSI C99)
>    <175b>   DW_AT_name        : (indirect string, offset: 0x505): Source1.c
>    <175f>   DW_AT_stmt_list   : 0x1362
>    <1763>   DW_AT_comp_dir    : (indirect string, offset: 0x50f): C:\Users\phlav\Documents\opus studio 2013\Projects\OpusLibrary
>    <1767>   Unknown AT value: 3fe1: 1
> <1><1767>: Abbrev Number: 6 (DW_TAG_subprogram)
>    <1768>   DW_AT_name        : (indirect string, offset: 0x54e): lib
>    <176c>   DW_AT_decl_file   : 1
>    <176d>   DW_AT_decl_line   : 3
>    <176e>   DW_AT_prototyped  : 1
>    <176e>   DW_AT_type        : <0x1741>
>    <1772>   DW_AT_external    : 1
>    <1772>   Unknown AT value: 3fe1: 1
> <2><1772>: Abbrev Number: 7 (DW_TAG_formal_parameter)
>    <1773>   DW_AT_name        : (indirect string, offset: 0xd9): x
>    <1777>   DW_AT_decl_file   : 1
>    <1778>   DW_AT_decl_line   : 3
>    <1779>   DW_AT_type        : <0x1741>
>
>
>
> ________________________________________
> From: Greg Clayton [gclayton at apple.com]
> Sent: Friday, April 29, 2016 2:02 PM
> To: Philippe Lavoie
> Cc: lldb-dev at lists.llvm.org
> Subject: Re: [lldb-dev] LTO debug info
>
>> On Apr 28, 2016, at 8:11 AM, Philippe Lavoie <philippe.lavoie at octasic.com> wrote:
>>
>>
>> What made me suspect a data corruption issue were asserts triggered in the VC++ vector implementation when we use an LTO binary with a DEBUG lldb build on Windows.
>>
>>  For example,
>>  at source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp:624,
>>
>>  File: C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\INCLUDE\vector
>>  Line: 240
>>  Expression: vector iterators incompatible
>>
>> Digging deeper, the assertion is triggered by accesses to DWARFCompileUnit::m_die_array
>>
>> In the 'parser_fn' lambda in SymbolFileDWARF::Index() that indexes each compile unit in parallel, DWARFCompileUnit::ExtractDIEsIfNeeded(..) , DWARFCompileUnit::Index(..) and DWARFCompileUnit::ClearDIEs() are called in sequence.
>>
>> 'ExtractDIEsIfNeeded' and 'ClearDIEs' respectively populates and clears DWARFCompileUnit::m_die_array.
>>
>> But when DWARFCompileUnit::Index(..) looks up a DIE in another CU and that CU is running 'ExtractDIEsIfNeeded' or 'ClearDIEs', it will access a stale or invalid DWARFCompileUnit::m_die_array ...
>
> So the main question is why is anything that is indexing the current CU only, accessing anything from another CU? It can't and shouldn't. That is the bug.
>
>> -Philippe
>>
>> ________________________________________
>> From: Greg Clayton [gclayton at apple.com]
>> Sent: Monday, April 25, 2016 7:59 PM
>> To: Philippe Lavoie
>> Cc: lldb-dev at lists.llvm.org
>> Subject: Re: [lldb-dev] LTO debug info
>>
>>> On Apr 25, 2016, at 1:37 PM, Philippe Lavoie via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>>>
>>> Hello,
>>>
>>> We noticed an issue with dwarf parsing when debugging a large application compiled with LTO.
>>>
>>> Since an LTO application's debug info can have inter-compile-unit references, (for example: a type defined in one compile-unit and used in another) we noticed that the "m_type_index" vector in SymbolFileDWARF can get corrupted. This is because many index structures in SymbolFileDWARF are generated in parallel.
>>>
>>> Is is a bug or is it expected to be the compiler's/linker's job to generate self-contained DWARF data without inter-CU references?
>>>
>>> -Philippe
>>
>> TEach SymbolFileDWARF should generate its own accelerator tables in a self contained way. No SymbolFileDWARF should be adding references to their accelerator maps that actually refer to DIEs from another SymbolFileDWARF. We might need to look at the indexing function that make sure if they see and DW_FORM_ref_addr attributes, that they don't add any of them to their index. That seems like the bug?
>>
>> Greg
>>
>
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20160511/e421f2b6/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dwarf.patch
Type: application/octet-stream
Size: 5772 bytes
Desc: dwarf.patch
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20160511/e421f2b6/attachment-0001.obj>


More information about the lldb-dev mailing list