<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class="">Thanks for your replies!</div><br class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><div class=""><div class="">Right now we assume that each .o file has only on CU so we don't need to open all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() which is something that gets run when we are trying to figure out which SymbolFile plug-in to load. [...] The problem used to be that even if we had a dSYM file, the loop that selected the symbol file plug-in would give each and every symbol file plugin a chance ...</div></div></blockquote></blockquote><div class=""><br class=""></div><div class="">Ok makes total sense, kind of historical reason.</div><div class=""><br class=""></div><blockquote type="cite" class=""><blockquote type="cite" class=""><div class=""><div class="">But that being said, in the past I re-ordered how the SymbolFile plug-ins were initialized to always put SymbolFileDWARF first and if it finds DWARF debug info or a dSYM file and has all the abilities then we stop looking for symbol file plug-ins that can load the current file. [...] So as soon as a SymbolFile plug-in can do everything we now stop.</div></div></blockquote></blockquote><div class=""><br class=""></div>Good to know!<div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">Note that counting the number of compile units can be done extremely cheaply (of course, except the cost of opening the file). Each unit as a header that contain its length (which gives you the next unit). I’m not sure we have a primitive that does this without parsing the DWARF, but it should be easy to add.</div></blockquote></div><div class=""><br class=""></div><div class="">Right, so we only need to parse the CU headers. That should be fast.</div><div class="">Opening each candidate .o file for the relatively rare case of having multiple CUs still sounds expensive, assuming that “thousands of .o files” may actually happen.</div><div class=""><br class=""></div><div class="">CalculateAbilities() does indeed call GetNumCompileUnits(), but what it really wants to know at this time is “do we have any CU in there”:</div><div class=""><br class="">uint32_t SymbolFileDWARFDebugMap::CalculateAbilities() {</div><div class=""> ...<br class=""> const uint32_t oso_index_count = GetNumCompileUnits();<br class=""> if (oso_index_count > 0) {</div> InitOSO();<br class=""> if (!m_compile_unit_infos.empty()) {<br class=""> return SymbolFile::CompileUnits | SymbolFile::Functions | …;<div class=""> }</div><div class="">}</div><div class=""><br class=""></div><div class="">As far as I can tell, we need the actual number of CUs only after we discovered plugins. In my case it’s during breakpoint resolution (i.e. BreakpointResolverFileLine::SearchCallback()). If we separated these two concerns conceptually (into HasCompileUnits() and GetNumCompileUnits()), couldn’t we then also do InitOSO() in two steps? Especially since lazy init is used everywhere already. This would avoid impact on CalculateAbilities() entirely.</div><div class=""><br class=""></div><div class="">That said, I don’t really know how big the change would get then. And it probably adds complexity, while the implementation is quite complex already.</div><div class="">Anyway, for now what do you think about the idea?</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">Do we really need this CU <-> Symbol mapping?</div></blockquote><br class=""></div><div class="">It’s used in SymbolFileDWARFDebugMap::SymbolContainsSymbolWithIndex(), which looks like dead code.</div><div class="">It’s also used in SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap(), which initialises the OSO range map once. In order to do that it iterates over all CUs, so changing this or adding a special case here seems possible.</div><div class=""><br class=""></div><div class=""><a href="https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172" class="">https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172</a></div><div class=""><br class=""></div><div class="">What do you think?</div><div class=""><br class=""></div><div class="">Best</div><div class="">Stefan</div><div><br class=""><blockquote type="cite" class=""><div class="">On 11. Sep 2018, at 18:03, Frédéric Riss <<a href="mailto:friss@apple.com" class="">friss@apple.com</a>> wrote:</div><div class=""><br class="Apple-interchange-newline"><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;"><div class="">On Sep 11, 2018, at 8:48 AM, Greg Clayton <<a href="mailto:clayborg@gmail.com" class="">clayborg@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" class=""><div class=""><br class="Apple-interchange-newline">On Sep 11, 2018, at 2:55 AM, Stefan Gränitz <<a href="mailto:sgraenitz@apple.com" class="">sgraenitz@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Hello everyone<div class=""><br class=""></div><div class="">I am investigating a bug that prevents correct breakpoint resolution in LTO objects with embedded DWARF (no separate dSYM file) and tracked it down to the initialization of SymbolFileDWARFDebugMap. This code seems to assume there is only one compile unit per object file, but LTO objects have more than that:</div><div class=""><div class=""><br class=""></div><div class=""><font face="Menlo" class="" style="font-size: 11px;">void SymbolFileDWARFDebugMap::InitOSO() {</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;">...</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const uint32_t oso_index_count =</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> symtab->AppendSymbolIndexesWithTypeAndFlagsValue(</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> eSymbolTypeObjectFile, k_oso_symbol_flags_value, oso_indexes);</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;">...</font></div><div class=""><font face="Menlo" class=""><span class="" style="font-size: 11px;"> m_compile_unit_infos.resize(oso_index_count); // <—— one CU per OSO entry in the Symtab</span></font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"><br class=""></font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> for (uint32_t i = 0; i < oso_index_count; ++i) {</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const uint32_t so_idx = oso_indexes[i] - 1;</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const uint32_t oso_idx = oso_indexes[i];</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const Symbol *so_symbol = symtab->SymbolAtIndex(so_idx);</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const Symbol *oso_symbol = symtab->SymbolAtIndex(oso_idx);</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;">...</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> const Symbol *last_symbol = symtab->SymbolAtIndex(sibling_idx - 1);</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> m_compile_unit_infos[i].first_symbol_index = so_idx;</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> m_compile_unit_infos[i].last_symbol_index = sibling_idx - 1;</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> m_compile_unit_infos[i].first_symbol_id = so_symbol->GetID();</font></div><div class=""><font face="Menlo" class="" style="font-size: 11px;"> m_compile_unit_infos[i].last_symbol_id = last_symbol->GetID();</font></div><div class=""><br class=""></div></div><div class="">The symptom is that LLDB will only read debug_line for one CU and miss all the rest. Thus, breakpoints in other CUs can’t be associated with line information.</div><div class=""><br class=""></div><div class="">I wonder if there is a good way to populate the correct number of compile units from the OSO entry at this point?</div></div></div></blockquote><div class=""><br class=""></div>The reason it is like this is we don't want to have to open all .o files when we parse the debug map in order to figure out a compile unit index. Right now the compile unit UserID is just the index of the .o file in the debug map. Opening thousands of .o files can impose a performance issue.<br class=""><blockquote type="cite" class=""><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div class=""><br class=""></div><div class="">The situation may appear similar to an archive file with a number of objects, but then we have separate OSO entries like “path/to/lib/libLLVMMCParser.a(AsmLexer.cpp.o)”. Furthermore LTO objects have one common symtab for all compile units and it was probably mixed up by optimization, so we cannot simply say that CUs start/end at certain symbol indexes as in the above code. The latter is used rarely and only in SymbolFileDWARFDebugMap, so there may be a workaround, but first I have to figure out the initial question:</div><div class=""><br class=""></div><div class="">How to get more information about compile units in an LTO object? Any ideas welcome!</div></div></div></blockquote><div class=""><br class=""></div>The only way is to open each .o file and see how many compile units they contain. Right now we assume that each .o file has only on CU so we don't need to open all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() which is something that gets run when we are trying to figure out which SymbolFile plug-in to load. But that being said, in the past I re-ordered how the SymbolFile plug-ins were initialized to always put SymbolFileDWARF first and if it finds DWARF debug info or a dSYM file and has all the abilities then we stop looking for symbol file plug-ins that can load the current file. The problem used to be that even if we had a dSYM file, the loop that selected the symbol file plug-in would give each and every symbol file plugin a chance to tell us how much info they could extract via a call to SymbolFile::CalculateAbilities() and that would cause us to open all .o files just to say "I parse all debug info" just like a previous plug-in could. So as soon as a SymbolFile plug-in can do everything we now stop.</div><div class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br class=""></div><div class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" class=""><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div class="">If that’s not possible, I may find another way to fix it further down the road, but then the name m_compile_unit_infos seems not exactly correct here. It’s rather something like m_referenced_object_infos, right?</div></div></div></blockquote><div class=""><br class=""></div><div class="">So now that that change has been in for a while, it might be ok to open each .o file and see how many compile units they contain and then populate m_compile_unit_infos as needed.</div></div></div></blockquote><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br class=""></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">Note that counting the number of compile units can be done extremely cheaply (of course, except the cost of opening the file). Each unit as a header that contain its length (which gives you the next unit). I’m not sure we have a primitive that does this without parsing the DWARF, but it should be easy to add. </div><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;"><div class=""><div class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div class="">You will need to watch for any usage of m_compile_unit_infos and make sure it does the correct thing.</div></div></div></blockquote><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br class=""></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">That’s the part I was worried about. The structure of m_compile_unit_infos makes the assumption that we can associate slices of symbols to a compile unit. I don’t think this is a correct assumption to make in the LTO case, and even if it were, we’d need to parse the DWARF and do some pretty heavy processing to extract the information. Do we really need this CU <-> Symbol mapping?</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br class=""></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">Fred </div><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;"><div class=""><div class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div class=""><blockquote type="cite" class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"></div></blockquote></div><br class=""><blockquote type="cite" class=""><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div class="">Btw.: My first attempt was a workaround for the symptom (see<span class="Apple-converted-space"> </span><a href="https://reviews.llvm.org/D51546" class="">https://reviews.llvm.org/D51546</a>). It simply reads all debug_lines for a single CU, but I’d really appreciate a better solution.</div></div></div></blockquote><div class=""><br class=""></div><div class="">The fix in D51546 seems wrong because the only way we get to a line table is via the DW_AT_stmt_list from a compile unit. If we can fix the LTO case to load all compile units from the LTO.o files with multiple CU's this fix won't be needed.</div><div class=""><br class=""></div><div class="">So the correct solution is to detect how many compile units are in each .o file and then make sure to find all places that were assuming anything about the OSO index being the compile unit UserID are fixed. Now that plug-in loading stops after a SymbolFile says it can handle everything we can probably do a bit more work. One issue is that .o files might have been cleaned up or removed, so be sure to test any solution by removing the .o files and seeing how we do.</div><div class=""><br class=""></div><div class="">I will be happy to review any patch you have. I can't think of any other reason the the OSO index needs to be the compile unit index. IF you do make a patch, please remove any functions in SymbolFileDWARFDebugMap that are dead code. SymbolFileDWARFDebugMap::GetModuleByOSOIndex() seems to be dead. If that is dead thenvSymbolFileDWARFDebugMap::GetObjectFileByOSOIndex() seems to be dead also.</div></div></div></blockquote></div></blockquote></div><br class=""></body></html>