<div dir="ltr">Ah, OK.<br><br>So this functionality was added in <a href="https://github.com/llvm/llvm-project/commit/c8ae09673969e7b179fe419d780d1d0f2d2c2c19">https://github.com/llvm/llvm-project/commit/c8ae09673969e7b179fe419d780d1d0f2d2c2c19</a>, though the comment there is a bit confusing: a true dwo file wouldn't have any skeleton units in it. Judging from the test case, what I was talking about was dumping .o files that contain both dwo and non-dwo sections. The original way dwo files were emitted was to produce a single .o file and then run objcopy to split out the dwo parts. Clang doesn't do that anymore (though llc still retains the functionality), instead producing the two files separately from the start, but -gsplit-dwarf=single now recreates that situation.<br><br>So this isn't intended to apply when navigating from a .o/linked executable to a .dwo. Reordering the contents of this function a bit, something in the realm of your second patch ( <a href="https://reviews.llvm.org/D96827">https://reviews.llvm.org/D96827</a> ), seems like the right path.<br><br>I might be inclined for it to be more like:<br><br>if (!AddrOffsetSectionBase) {<br> if (!IsDWO)<br> return None;<br> auto R = Context.info_section_units();<br> if (!hasSingleElement(R))<br> return None;<br> return (*R.begin())->getAddrOffsetSectionItem(Index);<br>}<br><br>To reduce indentation and avoid duplicate tests.<br><br>If you could update that review with code something like that, and include a test case, I'd be happy to review it.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Mar 3, 2021 at 1:01 PM Alexander Yermolovich <<a href="mailto:ayermolo@fb.com">ayermolo@fb.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Hello David</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Thank you for the example. I was able to reproduce your results.<br>
llvm-symbolizer 0x400611 -obj=a.out
<div>f2()</div>
<div>/home/ayermolo/local/tasks/T83058825/test.cpp:7:3</div>
<div>main</div>
/home/ayermolo/local/tasks/T83058825/test.cpp:13:3<br>
<br>
I was wrong to lump .dwo files (split mode) into this. I have primarily been looking at it in -gsplit-dwarf=single mode, where the .dwo sections are left in the .o files. <br>
In single mode the .o file contains both the Skeleton CU, which gets relocated by the linker, and the dwo sections. In -gsplit-dwarf=split mode, where the debug information is in the .dwo files, there are only .dwo sections. I am probably repeating what you already know, but in
case others who are not familiar read this. <span id="gmail-m_-593140788157315884🙂">🙂</span><br>
<br>
The reason your example works is that <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">-gsplit-dwarf defaults to -gsplit-dwarf=split. When the code gets to DWARFUnit::getAddrOffsetSectionItem, just like before, we
are in the DWO Context/DWO CU, so IsDWO is set. It tries to parse NormalUnits:<br>
</span>
<div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Menlo,Monaco,"Courier New",monospace;font-weight:normal;font-size:12px;line-height:18px">
<span><span style="color:rgb(86,156,214)">auto</span><span> R = </span><span style="color:rgb(156,220,254)">Context</span><span>.</span><span style="color:rgb(220,220,170)">info_section_units</span><span>();</span></span></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
But since we are dealing with .dwo files there is nothing there, just dwo sections.<br>
It then performs what looks like a sanity check, hasSingleElement, which returns false because NormalUnits is empty.<br>
At that point it goes on to retrieve the address in the DWO CU, the same path as with my exploratory changes.</div>
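<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
As a rough illustration of the lookup order described above, here is a toy C++ model. All of the type and function names (ToyUnit, ToyResult, ownTable, lookupAddress, and the two example functions) are hypothetical stand-ins rather than the real DWARFUnit code, and the address value is arbitrary. It only mimics the decision: a DWO unit first looks for exactly one normal unit in the same file and, if that check fails, falls back to its own address table.</div>
<pre>
```cpp
// Toy model only: hypothetical stand-ins, not the LLVM classes.
struct ToyUnit {
  bool IsDWO;                      // unit was read from .dwo sections
  bool HasAddrBase;                // whether an address-table base was ever set
  unsigned long long FirstAddress; // pretend entry 0 of this unit's address table
};

struct ToyResult {
  bool Found;
  unsigned long long Address;
};

// Look up entry 0 in the unit's own table, if it has one.
ToyResult ownTable(ToyUnit U) {
  if (!U.HasAddrBase)
    return {false, 0};
  return {true, U.FirstAddress};
}

// Lookup order as described in the message: a DWO unit delegates to the
// normal unit of the same file only when exactly one such unit exists.
ToyResult lookupAddress(ToyUnit U, const ToyUnit *NormalUnits,
                        int NumNormalUnits) {
  if (U.IsDWO) {
    if (NumNormalUnits == 1) // the "hasSingleElement" check
      return ownTable(NormalUnits[0]);
  }
  return ownTable(U);
}

// Split mode: the .dwo file has no normal units, so the DWO unit's own
// table (set up from the linked binary) is used.
ToyResult splitModeExample() {
  ToyUnit DwoUnit = {true, true, 0x400611};
  return lookupAddress(DwoUnit, nullptr, 0);
}

// Single mode: the .o file also holds one un-relocated skeleton unit whose
// address base was never set, so the delegated lookup finds nothing.
ToyResult singleModeExample() {
  ToyUnit Skeleton[1] = {{false, false, 0}};
  ToyUnit DwoUnit = {true, true, 0x400611};
  return lookupAddress(DwoUnit, Skeleton, 1);
}
```
</pre>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
In this model splitModeExample() finds the address because the .dwo file has no normal units, while singleModeExample() finds nothing because the lone skeleton unit from the .o file never had its address base set.</div>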
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Building in single mode, keeping .o around, should reproduce:<br>
clang++ -g -gsplit-dwarf=single test.cpp -O3 -c<br>
ld.lld -out a.out test.o<br>
<br>
Sorry for the confusion.<br>
<br>
Alex<br>
<br>
</div>
<div id="gmail-m_-593140788157315884appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_-593140788157315884divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>><br>
<b>Sent:</b> Tuesday, March 2, 2021 6:06 PM<br>
<b>To:</b> Alexander Yermolovich <<a href="mailto:ayermolo@fb.com" target="_blank">ayermolo@fb.com</a>><br>
<b>Cc:</b> Pavel Labath <<a href="mailto:pavel@labath.sk" target="_blank">pavel@labath.sk</a>>; <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] Extracting LocList address ranges from DWO .debug_info</font>
<div> </div>
</div>
<div>
<div dir="ltr">Could you provide more detailed repro steps - so far as I can see, llvm-symbolizer is correctly reading dwo files:<br>
<br>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><b>$ cat test.cpp</b></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">__attribute__((nodebug)) __attribute__((optnone)) void f1() {</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">__attribute__((always_inline)) inline void f2() {</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><span>
</span>f1();</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">int main() {</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><span>
</span>f2();</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><b>$ clang++-tot -g -gsplit-dwarf test.cpp -O3</b></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><b>$ llvm-symbolizer 0x401121 -obj=a.out</b></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">f2()</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">/usr/local/google/home/blaikie/dev/scratch/test.cpp:4:3</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">main</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">/usr/local/google/home/blaikie/dev/scratch/test.cpp:7:3</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(57,192,38)">
<span style="font-variant-ligatures:no-common-ligatures;color:rgb(0,0,0)"><b>$ rm test.dwo</b></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures"><b>$ llvm-symbolizer 0x401121 -obj=a.out</b></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">main</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">
<span style="font-variant-ligatures:no-common-ligatures">/usr/local/google/home/blaikie/dev/scratch/test.cpp:4:3</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
<br>
<br>
</p>
</div>
<br>
<div>
<div dir="ltr">On Tue, Mar 2, 2021 at 5:40 PM Alexander Yermolovich <<a href="mailto:ayermolo@fb.com" target="_blank">ayermolo@fb.com</a>> wrote:<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Hello David<br>
<br>
Thank you for the pointer. <br>
I looked at llvm-symbolizer and I think it suffers from the same problem.<br>
First output:<br>
Case 1: Monolithic debug information:</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
llvm-symbolizer --obj bzip2 --print-address 0x00000000004014b7
<div>0x4014b7</div>
<div>copyFileName</div>
<div>/home/ayermolo/local/bzip2_base/bzip2.c:941:3</div>
<div>main</div>
/home/ayermolo/local/bzip2_base/bzip2.c:1823:4</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Case 2: Debug fission with upstream build<br>
llvm-symbolizer --obj bzip2 --print-address 0x00000000004014b7
<div>0x4014b7</div>
<div>main</div>
/home/ayermolo/local/bzip2_DF/bzip2.c:941:3<br>
<br>
Case 3: Debug fission with the proposed changes (either one will work)<br>
llvm-symbolizer --obj bzip2 --print-address 0x00000000004014b7
<div>0x4014b7</div>
<div>copyFileName</div>
<div>/home/ayermolo/local/bzip2_DF/bzip2.c:941:3</div>
<div>main</div>
/home/ayermolo/local/bzip2_DF/bzip2.c:1823:4</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
For reference <br>
Debug entry in Monolithic format<br>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">0x00000784:
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_TAG_inlined_subroutine [29] *</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_abstract_origin [DW_FORM_ref4]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">(cu + 0x06a9 => {0x000006a9} "copyFileName")</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_low_pc [DW_FORM_addr]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">(0x00000000004014b7)</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_high_pc [DW_FORM_data4]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">(0x0000001b)</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_call_file [DW_FORM_data1]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">("/home/ayermolo/local/bzip2_base/bzip2.c")</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_call_line [DW_FORM_data2]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">(1823)</span></p>
<p style="margin:0px;font:13px "Helvetica Neue"">
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">DW_AT_call_column [DW_FORM_data1]</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
</span><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">(0x04)</span></p>
<br>
<br>
Debug entry in Debug fission format<br>
0x0000052e: DW_TAG_inlined_subroutine [29] *
<div> DW_AT_abstract_origin [DW_FORM_ref4] (cu + 0x047e => {0x0000047e} "copyFileName")</div>
<div> DW_AT_low_pc [DW_FORM_GNU_addr_index] (indexed (0000001a) address = 0x00000000000000a7 ".text.main")</div>
<div> DW_AT_high_pc [DW_FORM_data4] (0x0000001b)</div>
<div> DW_AT_call_file [DW_FORM_data1] (0x01)</div>
<div> DW_AT_call_line [DW_FORM_data2] (1823)</div>
DW_AT_call_column [DW_FORM_data1] (0x04)</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
<br>
To dig into APIs.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
SymbolizableObjectFile::symbolizeInlinedCode → DWARFContext::getInliningInfoForAddress → DWARFUnit::getInlinedChainForAddress → DWARFUnit::parseDWO</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
At which point DWO Context is created, DWO CU is created and DWO field is set in Skeleton CU.<br>
<br>
<span style="font-family:Calibri,Helvetica,sans-serif">By comparison this is how I get DWO CU:</span><br>
<span style="font-family:Calibri,Helvetica,sans-serif">DWARFUnit::</span><code><span style="font-family:Calibri,Helvetica,sans-serif">getNonSkeletonUnitDIE --> DWARFUnit::parseDWO()</span></code></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="monospace"><br>
</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="monospace"><span style="font-family:Calibri,Helvetica,sans-serif">After parseDWO a DWARFUnit::getSubroutineForAddress is called on DWO CU (since we are dealing with debug fission).</span><span style="font-family:Calibri,Helvetica,sans-serif"> </span></font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="monospace"><span style="font-family:Calibri,Helvetica,sans-serif">
<div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Menlo,Monaco,"Courier New",monospace;font-weight:normal;font-size:12px;line-height:18px">
<span><span>DWARFDie SubroutineDIE =</span></span><br>
<span><span> (DWO ? *DWO : *</span><span style="color:rgb(86,156,214)">this</span><span>).</span><span style="color:rgb(220,220,170)">getSubroutineForAddress</span><span>(Address);</span></span></div>
<br>
</span></font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="monospace"><span style="font-family:Calibri,Helvetica,sans-serif"><br>
</span></font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="monospace"><span style="font-family:Calibri,Helvetica,sans-serif"><span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">getSubroutineForAddress</span> calls </span><span style="font-family:Calibri,Helvetica,sans-serif">DWARFUnit::updateAddressDieMap.</span><span style="font-family:Calibri,Helvetica,sans-serif"><br>
</span><span style="font-family:Calibri,Helvetica,sans-serif">As part of </span><span style="font-family:Calibri,Helvetica,sans-serif">DWARFUnit::updateAddressDieMap we get this sequence of calls:</span></font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif">DWARFDie::getAddressRanges() → DWARFDie::getLowAndHighPC → toSectionedAddress → DWARFFormValue::getAsAddress() → DWARFUnit::getAddrOffsetSectionItem</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif"><br>
</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif">The <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DWARFUnit::getAddrOffsetSectionItem returns None (to circle back to the original post) because in this DWO CU the IsDWO flag is
set, so it then tries to parse NormalUnits. Except now it gets the un-relocated Skeleton CU from the .o/.dwo, and it invokes <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DWARFUnit::getAddrOffsetSectionItem on that. Since AddrOffsetSectionBase
is not set, it returns None.<br>
<br>
So we basically start from the relocated Skeleton CU we got from the binary's debug information, create the DWO CU from the .o/.dwo, then create a Skeleton CU from the .o/.dwo and try to get the address from it. Since .debug_addr is in the binary, <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">and
we never set the correct section/offset in that Skeleton CU,</span> clearly that doesn't work.<br>
<br>
My usage model.<br>
It is more bottom-up, as you have mentioned, because of what needs to be done. BOLT moves functions around, hoists cold sections out into their own functions, etc. It also converts low_pc/high_pc to ranges. So .debug_ranges, .debug_addr, and .debug_loc
are completely rewritten. We then update every reference in a DIE with the new value of <span style="color:rgb(0,0,0);font-family:Calibri,Arial,Helvetica,sans-serif;background-color:rgb(255,255,255);display:inline"><span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DW_AT_low_pc</span> or
modify <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DW_AT_low_pc/<span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DW_AT_high_pc to range semantics. <span style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;background-color:rgb(255,255,255);display:inline">This
means that we need to iterate over every DIE, get the original address, map it to the new address or addresses, and update the DIE, both in CUs in the binary (in the case of monolithic or fission + -fsplit-dwarf-inlining) and in .debug_info.dwo CUs.</span><br>
</span></span></span><br>
For example, when processing DW_TAG_inlined_subroutine, <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">DWARFDie::getAddressRanges()<span> is invoked, which follows the same execution path as when it is invoked in
the symbolizer and hits the same problem.</span></span><br>
<br>
</span></span></font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif">Now I can get raw index Value from DIE, then look up address in Skeleton CU with </font><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal">getAddrOffsetSectionItem,
but it exposes extra complexity.</span></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif"><br>
</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif">I think that if we can do a fix "under the hood" it will simplify things, and it looks like it will also help tools like the symbolizer.</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif"><br>
</font></div>
<div style="font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<font face="Calibri, Helvetica, sans-serif">To reiterate, the two patches are just meant to start the discussion; maybe a more extensive refactoring is necessary. Working on BOLT and looking at the symbolizer (or at least part of it), I don't quite understand the logic in <span style="color:rgb(0,0,0);background-color:rgb(255,255,255);display:inline">getAddrOffsetSectionItem.
It doesn't seem to work, at least in those usage models. </span><br>
<br>
Alex<br>
<br>
</font><br>
</div>
<div id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>><br>
<b>Sent:</b> Wednesday, February 24, 2021 4:42 PM<br>
<b>To:</b> Alexander Yermolovich <<a href="mailto:ayermolo@fb.com" target="_blank">ayermolo@fb.com</a>><br>
<b>Cc:</b> Pavel Labath <<a href="mailto:pavel@labath.sk" target="_blank">pavel@labath.sk</a>>;
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] Extracting LocList address ranges from DWO .debug_info</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div>
<div dir="ltr">On Mon, Feb 22, 2021 at 10:50 AM Alexander Yermolovich <<a href="mailto:ayermolo@fb.com" target="_blank">ayermolo@fb.com</a>> wrote:<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Hello David.<br>
<br>
My apologies, let me provide some context. I am helping with the BOLT binary optimizer (soon to be upstreamed). As part of its functionality it updates debug information to reflect the changes it has made to the binary: moving functions around, extracting cold
blocks, ICF, etc.<br>
Right now it works with monolithic debug information, but not with debug fission.<br>
<br>
It completely rewrites debug line and ranges/aranges, and patches the relevant DIE entries to point to new offsets within those sections. That means finding what the current addresses are in a DIE, mapping them to new addresses, and from that to new offsets within sections.
For debug fission it will also need to rewrite .debug_addr and update the indices that point to it.<br>
<br>
I looked at llvm-symbolizer and this seems a bit high level. </div>
</div>
</blockquote>
<div><br>
It is, but somewhere down there it has to follow from executable to dwo/dwp files - that part of its implementation might be able to be reused (may benefit/require some refactoring to make it more reusable) for the purposes you have. I'd suggest looking there
first, if you have a chance.<br>
<br>
Perhaps that looks like refactoring llvm-symbolizer to use a codepath that looks like the ones you're already using, and making that work with dwos in a way that it doesn't already - or changing your code to more like some aspects of llvm-symbolizer's implementation
and follow that codepath.<br>
<br>
So llvm-symbolizer goes down through LLVMSymbolizer::symbolizeInlinedCodeCommon -> SymbolizableObjectFile::symbolizeInlinedCode -> DWARFContext::getInliningInfoForAddress<br>
<br>
It looks like this code does correctly stitch together the addr and ranges tables in "parseDWO" (where it calls setAddrOffsetSection/setRangesSection).<br>
<br>
But it sounds like you're trying to go from loading DWARFContext for dwo/dwp files directly, back to the skeleton/executable - it may be better to go forward instead of backwards? Load up the DWARFContext for the linked executable, then walk the (possibly skeleton)
units there, and parseDWO/getDWO to walk into the split units - and those split units, loaded that way, should have their addr table working correctly due to the parseDWO code?<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
So usage model is closer to 1) I think.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Right now there is no link, but one solution would be to add it when <i style="font-size:12pt;font-variant-ligatures:inherit;font-variant-caps:inherit;font-weight:inherit">getNonSkeletonUnitDIE/parseDWO </i>is called. This reflects the code in getAddrOffsetSectionItem
that tries to parse normal CUs when the current DWARFUnit is a DWO unit. I don't know what the original intent of that code was, but as it stands, I don't think it works because it parses the non-relocated skeleton CU in A.o.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Rough idea: *<br>
<a href="https://reviews.llvm.org/D96826" id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806x_gmail-m_-5881933187940789829LPlnk980851" target="_blank">https://reviews.llvm.org/D96826</a><br>
<br>
Alternative, that whole code can be skipped entirely. *<br>
<a href="https://reviews.llvm.org/D96827" id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806x_gmail-m_-5881933187940789829LPlnk573221" target="_blank">https://reviews.llvm.org/D96827</a><br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
This works because in parseDWO we set AddrOffsetSectionBase, and AddrOffsetSection from .debug_addr in binary. Then in getAddrOffsetSectionItem we have all the information to get addresses from indices. One weird part is that DWARFDataExtractor is created with
A.o file, while AddrOffsetSection is from A binary.<br>
<br>
getAddrOffsetSectionItem is an important low-level API. For example, it is also used by DWARFDie::getLowAndHighPC, along with DWARFDie::getLocations and DWARFUnit::findLoclistFromOffset. So making a fix at that level would make other, higher-level
APIs work for DWO contents.</div>
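<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
A toy sketch of why having the base and table from the binary is sufficient, assuming 8-byte addresses and a table that has already been decoded into an array. The names (ToySkeleton, ToyDwoUnit, toyParseDWO, toyGetAddr, toyExample) and the table contents are made up for illustration and are not LLVM APIs. Once the DWO unit carries the base and the table from the binary, an index can be turned into an address without consulting any other unit.</div>
<pre>
```cpp
// Toy model only: hypothetical stand-ins for the real unit state.
struct ToyAddrTable {
  const unsigned long long *Entries; // decoded address entries from the binary
  unsigned long long NumEntries;
};

struct ToySkeleton {
  unsigned long long AddrBase; // byte offset of this CU's slice of the table
  ToyAddrTable Table;          // relocated table from the linked binary
};

struct ToyDwoUnit {
  bool HasAddrBase;
  unsigned long long AddrBase;
  ToyAddrTable Table;
};

struct ToyAddress {
  bool Found;
  unsigned long long Value;
};

const unsigned long long ToyAddrSize = 8; // assumed address size in bytes

// Mirrors the idea of parseDWO handing the skeleton's base and table
// over to the DWO unit.
ToyDwoUnit toyParseDWO(ToySkeleton Skeleton) {
  ToyDwoUnit Dwo = {true, Skeleton.AddrBase, Skeleton.Table};
  return Dwo;
}

// Turn an address index into an address using only the DWO unit's own state.
ToyAddress toyGetAddr(ToyDwoUnit Dwo, unsigned long long Index) {
  if (!Dwo.HasAddrBase)
    return {false, 0};
  unsigned long long Offset = Dwo.AddrBase + Index * ToyAddrSize;
  // Reject indices that would read past the end of the table.
  if (Offset + ToyAddrSize > Dwo.Table.NumEntries * ToyAddrSize)
    return {false, 0};
  return {true, Dwo.Table.Entries[Offset / ToyAddrSize]};
}

// Example: a three-entry table where this CU's slice starts at byte 8,
// so index 0 refers to the second entry.
ToyAddress toyExample(unsigned long long Index) {
  static const unsigned long long Entries[3] = {0x401000, 0x4014b7, 0x402000};
  ToySkeleton Skeleton = {8, {Entries, 3}};
  return toyGetAddr(toyParseDWO(Skeleton), Index);
}
```
</pre>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Here toyExample(0) yields 0x4014b7 and toyExample(2) reports no address, since that index falls past the end of the toy table.</div>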
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
*Diffs are same ones as previously mentioned.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
Alex</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
"</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<span style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt">Sorry I'm not really following all these pieces.</span>
<pre style="color:rgb(0,0,0)">There's two basic ways these APIs are predominantly used:
1) llvm-dwarfdump: This opens one file/context at a time, and generally
doesn't open other files - such as dwos or o/exe for skeleton. (indeed,
there's no reliable way to find a skeleton, given a dwo - only to find dwos
given skeletons)
2) llvm-symbolizer: this opens executable files (or .o files) and from
there can load dwo/dwp/dsym related files as needed
What sort of use case do you have? I guess it can/should look something
like (2) so can you use the LLVM debug info APIs in a similar manner to
llvm-symbolizer to achieve your goals?</pre>
"</div>
<div id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806x_gmail-m_-5881933187940789829appendonsend">
</div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_-593140788157315884x_gmail-m_3019932919336458806x_gmail-m_-5881933187940789829divRplyFwdMsg" dir="ltr">
<font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>><br>
<b>Sent:</b> Monday, February 15, 2021 10:09 PM<br>
<b>To:</b> Alexander Yermolovich <<a href="mailto:ayermolo@fb.com" target="_blank">ayermolo@fb.com</a>>; Pavel Labath <<a href="mailto:pavel@labath.sk" target="_blank">pavel@labath.sk</a>><br>
<b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] Extracting LocList address ranges from DWO .debug_info</font>
<div> </div>
</div>
<div><font size="2"><span style="font-size:11pt">
<div>This stuff is a bit ad-hoc at best.<br>
<br>
I believe some of these APIs have been generalized enough to be usable<br>
for your use-case, but it might be at a lower level - specifically I<br>
think the loclist infrastructure is used by lldb when parsing DWARFv5.<br>
But it might be used without some of the LLVM DWARF Unit abstractions<br>
you're using. (those abstractions are used in llvm-dwarfdump - which<br>
often isn't dealing with both .o and .dwo, but only dumping one of the<br>
files & doing what it can (or sometimes dumping one file containing<br>
both sets of sections, in which case it can do some address lookup,<br>
etc, more conveniently))<br>
<br>
On Fri, Feb 12, 2021 at 6:07 PM Alexander Yermolovich via llvm-dev<br>
<<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
><br>
> Hello<br>
><br>
> I am wondering if this is a bug, or more likely something I am doing wrong/using wrong APIs.<br>
> I have binary A, and object file A.o, compiled with Clang debug fission single mode. So .dwo sections are in the object file. Although with split mode it would be the same behavior.<br>
> Relevant parts of the code:<br>
> for (const auto &CU : DwCtx->compile_units()) {<br>
> auto *const DwarfUnit = CU.get();<br>
> if (llvm::Optional<uint64_t> DWOId = DwarfUnit->getDWOId()) {<br>
> auto *CUDWO = static_cast<DWARFCompileUnit*>(DwarfUnit->getNonSkeletonUnitDIE(false).getDwarfUnit());<br>
> ...<br>
> }<br>
> }<br>
><br>
> Later in the code I iterate over DIEs for .debug_info.dwo and call<br>
> DIE.getLocations(dwarf::DW_AT_location);<br>
><br>
> Alternatively, I can manually extract the offset and call<br>
> CUnit->findLoclistFromOffset(Offset);<br>
><br>
> It fails because it tries to look up the address using a DWARFUnit in NormalUnits that it extracts from A.o.<br>
> Under the hood, visitAbsoluteLocationList is called with getAddrOffsetSectionItem passed in.<br>
> Since this DWARFUnit is a DWO unit, it invokes Context.info_section_units(), which uses A.o to create DW_SECT_INFO and DW_SECT_EXT_TYPES.<br>
> It then calls itself, but from the newly constructed Debug DWARFUnit, the skeleton CU that is in A.o.<br>
><br>
> Because of the way it's constructed, AddrOffsetSectionBase is never set, so getAddrOffsetSectionItem returns None. Eventually an error is returned from the high-level API call.<br>
><br>
> I ended up doing this to get address ranges:<br>
> DWARFLocationExpressionsVector LocEVector;<br>
> auto CallBack = [&](const DWARFLocationEntry &Entry) -> bool {<br>
> auto StartAddress =<br>
> BaseUnit->getAddrOffsetSectionItem(Entry.Value0);<br>
> if (!StartAddress) {<br>
> //TODO: Handle Error<br>
> return false;<br>
> }<br>
> LocEVector.emplace_back(DWARFLocationExpression{DWARFAddressRange{<br>
> (*StartAddress).Address, (*StartAddress).Address + Entry.Value1,<br>
> Entry.SectionIndex}, Entry.Loc});<br>
> return true;<br>
> };<br>
><br>
> if(Unit->getLocationTable().visitLocationList(&Offset, CallBack))<br>
> ...<br>
><br>
><br>
> But back to the original API calls. Are they just not designed to work with DWO CUs, or am I missing something?<br>
><br>
> Even if AddrOffsetSectionBase were set to 0, the address section it accesses is in A.o and is not relocated. One would still need to get the base address from the skeleton CU to get fully resolved address ranges, or do what I did and use the index to
access the binary's .debug_addr section directly (with the appropriate AddrOffsetSectionBase).<br>
><br>
> Thank You<br>
> Alex<br>
> _______________________________________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a> <br>
</div>
</span></font></div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote></div>