<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Oct 15, 2013 at 1:31 PM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Tue, Oct 15, 2013 at 1:29 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>> wrote:<br>


><br>

><br>

><br>

> On Tue, Oct 15, 2013 at 11:34 AM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>

>><br>

>><br>

>><br>

>><br>

>> On Tue, Oct 15, 2013 at 10:51 AM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>> wrote:<br>

>>><br>

>>><br>

>>><br>

>>><br>

>>> On Tue, Oct 15, 2013 at 10:10 AM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>><br>

>>> wrote:<br>

>>>><br>

>>>><br>

>>>><br>

>>>><br>

>>>> On Tue, Oct 15, 2013 at 10:05 AM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>><br>

>>>> wrote:<br>

>>>>><br>

>>>>><br>

>>>>><br>

>>>>><br>

>>>>> On Wed, Oct 9, 2013 at 5:22 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>><br>

>>>>> wrote:<br>

>>>>>><br>

>>>>>><br>

>>>>>><br>

>>>>>><br>

>>>>>> On Wed, Oct 9, 2013 at 1:32 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>><br>

>>>>>> wrote:<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> David,<br>

>>>>>>><br>

>>>>>>> Thanks for reviewing!<br>

>>>>>>><br>

>>>>>>> On Wed, Oct 9, 2013 at 11:36 AM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>><br>

>>>>>>> wrote:<br>

>>>>>>>><br>

>>>>>>>> Might be easier if these were on Phabricator, but here are some<br>

>>>>>>>> thoughts:<br>

>>>>>>>><br>

>>>>>>>> 0001:<br>

>>>>>>>>   This patch generally, while separated for legibility, is untested<br>

>>>>>>>> & difficult to discuss in isolation.<br>

>>>>>>><br>

>>>>>>> I agree, this patch adds the functionality but does not use it, the<br>

>>>>>>> 2nd patch uses ref_addr.<br>

>>>>>>> If you think I should merge the two and commit as a single patch, let<br>

>>>>>>> me know.<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>> I may need to refer to the second patch in reviewing this first one.<br>

>>>>>>>>   DwarfDebug.cpp:<br>

>>>>>>>>     computeSizeAndOffsets:<br>

>>>>>>>>       I believe this produces the wrong offset for the 3rd CU and<br>

>>>>>>>> onwards. computeSizeAndOffset returns the EndOffset which is absolute, not<br>

>>>>>>>> relative to the Offset passed in, so it should be assigned to SecOffset, not<br>

>>>>>>>> added to it. (eg: if you have CUs at 0, 42, and 57 - the first pass through<br>

>>>>>>>> SecOffset will be zero, then it'll be 0 + 42, then on the 3rd it'll be (0 +<br>

>>>>>>>> 42) + 57 which isn't right, it should just be 57). Please add more test<br>

>>>>>>>> coverage while fixing this issue.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> computeSizeAndOffset takes an offset that is relative to the start of<br>

>>>>>>> the CU and returns the offset relative to the CU after laying out the DIE.<br>

>>>>>>> The initial offset before laying out the CU DIE is the header size,<br>

>>>>>>> EndOffset will be the offset relative to the CU after laying out the whole<br>

>>>>>>> CU DIE.<br>

>>>>>>> We can think of EndOffset as the size of the whole CU DIE. SecOffset<br>

>>>>>>> is the offset from the Debug Info section, so we can update it by adding the<br>

>>>>>>> CU size.<br>

>>>>>>><br>

>>>>>>>   // Offset from the beginning of debug info section.<br>

>>>>>>>   unsigned SecOffset = 0;<br>

>>>>>>>   for (SmallVectorImpl<CompileUnit *>::iterator I = CUs.begin(),<br>

>>>>>>>          E = CUs.end(); I != E; ++I) {<br>

>>>>>>>     (*I)->setDebugInfoOffset(SecOffset);<br>

>>>>>>>     unsigned Offset =<br>

>>>>>>>       sizeof(int32_t) + // Length of Compilation Unit Info<br>

>>>>>>>       sizeof(int16_t) + // DWARF version number<br>

>>>>>>>       sizeof(int32_t) + // Offset Into Abbrev. Section<br>

>>>>>>>       sizeof(int8_t);   // Pointer Size (in bytes)<br>

>>>>>>><br>

>>>>>>>     unsigned EndOffset = computeSizeAndOffset((*I)->getCUDie(),<br>

>>>>>>> Offset);<br>

>>>>>>>     SecOffset += EndOffset;<br>

>>>>>>>   }<br>

>>>>>><br>

>>>>>><br>

>>>>>> Added more comments in attached patch.<br>

>>>>>><br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Eric/Manman: rough design question: compute the absolute offset of<br>

>>>>>>>> each CU within the debug_info section and describe them all relative to a<br>

>>>>>>>> single symbol at the start of the debug_info section, or should we put a<br>

>>>>>>>> label at the start of each CU?<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> Either way should work. But since we already have the section offset<br>

>>>>>>> for each CU, describing relative to the start of debug_info section saves us<br>

>>>>>>> a few symbols :)<br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> 0002:<br>

>>>>>>>>   ref_addr_relocation.ll:<br>

>>>>>>>>     seems a bit vague in terms of how you test for the relocation. I<br>

>>>>>>>> think it'd make more sense to test the assembly, than the reafobj output,<br>

>>>>>>>> that way you can test that the correct bytes have the relocation rather than<br>

>>>>>>>> just that there's "some" .debug_info relocation in the file. For an example,<br>

>>>>>>>> see test/DebugInfo/X86/tls.ll I wrote - it has some "unannotated" bytes<br>

>>>>>>>> because we still don't have a nice way to annotate location bytes that are<br>

>>>>>>>> DWARF expressions, but it's close - I guess those should be CHECK-NEXTs,<br>

>>>>>>>> though. In any case, you should be able to check a few lines of assembly<br>

>>>>>>>> with the # DW_AT/DW_TAG annotation comments.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> I can check for ".quad .Lsection_info+38 #DW_AT_type".<br>

>>>>>><br>

>>>>>> Done in attached patch.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>>     You'd need to add the tu3.cpp from my example if you want to<br>

>>>>>>>> demonstrate that the relocation is actually working as intended and avoiding<br>

>>>>>>>> the bogus result I showed.<br>

>>>>>>>>   type-unique-simple-a.ll<br>

>>>>>>>>     While I agree that having common test cases helps reduce the<br>

>>>>>>>> number of separate invocations we have, this test case seems like it might<br>

>>>>>>>> be becoming a little hard to follow what's under test. Just going from the<br>

>>>>>>>> diff I'm not sure what's what. Could you give a brief description of the<br>

>>>>>>>> state of type-unique-simple2 files? What's involved? What's it meant to<br>

>>>>>>>> test? What's it actually testing?<br>

>>>>>>><br>

>>>>>>> I can add more comments. The source files are included in the testing<br>

>>>>>>> case. I am checking that llvm-link only generates a single copy of the<br>

>>>>>>> struct and the backend generates a single DIE and uses ref_addr.<br>

>>>>>>> The changes are to check "the backend generates a single DIE and uses<br>

>>>>>>> ref_addr".<br>

>>>>>><br>

>>>>>> Done in attached patch.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>>   DIE.h:<br>

>>>>>>>>     checkCompileUnit could probably be called<br>

>>>>>>>> "getCompileUnitOrNull", the name "check*" seems to imply that it does some<br>

>>>>>>>> checking, which isn't true.<br>

>>>>>>><br>

>>>>>>> Will do.<br>

>>>>>><br>

>>>>>> Done in attached patch.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>>   DwarfCompileUnit.cpp:<br>

>>>>>>>>     the "istype || issubprogram" check should probably be pulled out<br>

>>>>>>>> into a separate function, something like "isShareableAcrossCUs" or something<br>

>>>>>>>> like that (please, that's not the best name, let's discuss it further before<br>

>>>>>>>> we settle on a name) so that getDIE and insertDIE are sure to use the same<br>

>>>>>>>> test at all times.<br>

>>>>>>><br>

>>>>>>> Yes, the condition is shared between getDIE and insertDIE. I like<br>

>>>>>>> isSharableAcrossCUs, because that is why the map is in DwarfDebug instead of<br>

>>>>>>> CompileUnit.<br>

>>>>>><br>

>>>>>> Done in attached patch.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>><br>

>>>>>>>>   Why does addDIEEntry take a form? When does the caller get to<br>

>>>>>>>> choose the form rather than the callee deciding between ref4 and ref_addr<br>

>>>>>>>> based on context?<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> addDIEEntry took a form before my change and I didn't check why it<br>

>>>>>>> did.<br>

>>>>>>> I will check if all callers always use ref4, if it it true, I will<br>

>>>>>>> submit a cleanup patch first.<br>

>>>>>><br>

>>>>>> Done in attached patch.<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>>>   I'm still unsure about this worklist thing - do your current tests<br>

>>>>>>>> cover the need for the worklist? ie: if we removed that logic, would tests<br>

>>>>>>>> fail? Can you describe a specific sequence where the worklist is necessary?<br>

>>>>>>><br>

>>>>>>><br>

>>>>>>> If we are sure that DIEs are always added to an owner before calling<br>

>>>>>>> addDIEEntry, we don't need the worklist.<br>

>>>>>>> But I saw cases where that was not true, I will get a reduced testing<br>

>>>>>>> case.<br>

>>>>>><br>

>>>>>><br>

>>>>>> If we try to assert both DIEs have owner in addDIEEntry, the following<br>

>>>>>> testing cases will fail:<br>

>>>>>>     LLVM :: DebugInfo/X86/multiple-aranges.ll<br>

>>>>>>     LLVM :: DebugInfo/X86/ref_addr_relocation.ll<br>

>>>>>>     LLVM :: DebugInfo/X86/stmt-list-multiple-compile-units.ll<br>

>>>>>>     LLVM :: DebugInfo/two-cus-from-same-file.ll<br>

>>>>>>     LLVM :: Linker/type-unique-simple-a.ll<br>

>>>>>>     LLVM :: Linker/type-unique-simple2.ll<br>

>>>>>><br>

>>>>>> For ref_addr_relocation, it failed in<br>

>>>>>> #5  0x0000000100b169ba in llvm::DwarfDebug::addDIEEntry<br>

>>>>>> (this=0x103023600, Die=0x102e14090, Attribute=73, Entry=0x10302a9d0) at<br>

>>>>>> /Users/manmanren/llvm_git/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:3071<br>

>>>>>> #6  0x0000000100b040e0 in llvm::CompileUnit::addType<br>

>>>>>> (this=0x102e13ec0, Entity=0x102e14090, Ty={<llvm::DIScope> =<br>

>>>>>> {<llvm::DIDescriptor> = {DbgNode = 0x102e05f30}, <No data fields>}, <No data<br>

>>>>>> fields>}, Attribute=73) at<br>

>>>>>> /Users/manmanren/llvm_git/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:910<br>

>>>>>> #7  0x0000000100b05bff in llvm::CompileUnit::createGlobalVariableDIE<br>

>>>>>> (this=0x102e13ec0, N=0x102e068c0) at<br>

>>>>>> /Users/manmanren/llvm_git/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:1505<br>

>>>>>><br>

>>>>>> If we look at DwarfCompileUnit.cpp:<br>

>>>>>>     VariableDIE = new DIE(GV.getTag());<br>

>>>>>>     // Add to map.<br>

>>>>>>     insertDIE(N, VariableDIE);<br>

>>>>>><br>

>>>>>>     // Add name and type.<br>

>>>>>>     addString(VariableDIE, dwarf::DW_AT_name, GV.getDisplayName());<br>

>>>>>>     addType(VariableDIE, GTy);<br>

>>>>>><br>

>>>>>> The VariableDIE is not added to an owner yet when calling addType.<br>

>>>>><br>

>>>>><br>

>>>>> I believe I have addressed all review comments, the patches are<br>

>>>>> re-attached here for convenience.<br>

>>>><br>

>>>><br>

>>>> So I'm still thinking about the work list work.<br>

>>>><br>

>>>> If we don't know which CU a DIE is in - isn't it, necessarily, going to<br>

>>>> be in the current CU (& thus referenced by ref_data4 (using a CU-local<br>

>>>> offset), not ref_addr)?<br>

>>><br>

>>><br>

>>> That may be true. But can we prove that?<br>

>><br>

>><br>

>> We really shouldn't add extra complexity to the codebase just because we<br>

>> don't know how the codebase works - that's what leads to the kind of<br>

>> complexity we see in the debug info handling today.<br>

>><br>

>>><br>

>>> There are two ways of handling this:<br>

>>> 1> use a worklist to be conservative<br>

>>> 2> do not use a worklist, but add an assertion when emitting a DIE A,<br>

>>> make sure the DIE referenced with ref4 is indeed in the same CU as DIE A.<br>

>><br>

>><br>

>> Please just add this assertion. If it fires we'll have a test case and a<br>

>> concrete justification for this complexity such that should anyone remove it<br>

>> later because it looked unnecessary, they'll at least have a failing test to<br>

>> explain why it was there in the first place.<br>

>><br>

>>><br>

>>><br>

>>> Let me know which one you prefer.<br>

>>><br>

>>> Do you have any comments on the ref_addr patch (the 1st patch of the<br>

>>> two)?<br>

>><br>

>><br>

>> Nothing much - I wouldn't mind Eric taking a look (& would rather you not<br>

>> commit this until you're committing the second patch, since it's otherwise<br>

>> untested/unjustified code) on the label/offset related stuff since I'm less<br>

>> familiar with that.<br>

><br>

><br>

> Eric,<br>

><br>

> Do you want to take a look at the 1st patch?<br>

><br>

<br>

</div></div>I do, I've been following the thread, but give me a day or so.<br></blockquote><div><br></div><div>Sure :)</div><div><br></div><div>Thanks,</div><div>Manman</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Thanks.<br>

<span class="HOEnZb"><font color="#888888"><br>

-eric<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

>><br>

>><br>

>> There's one or two cases of {} on single-statement blocks you could fix<br>

>> up.<br>

><br>

> I couldn't find that.<br>

> Are you referring to:<br>

><br>

> +      } else {<br>

><br>

> +        Asm->EmitInt32(Addr);<br>

><br>

> +      }<br>

><br>

> It has a multi-statement if, I thought the rule is to have a parenthesis for<br>

> the else as well.<br>

><br>

>><br>

>><br>

>> "DebugInfoOffset" (both the member and the functions to set/get it) might<br>

>> be more meaningfully named "SectionOffset" - but I'm not sure. Eric? Other<br>

>> names (DebugInfoSectionOffset?)?<br>

><br>

><br>

> Eric, any opinion here?<br>

><br>

> Thanks,<br>

> Manman<br>

>><br>

>><br>

>><br>

>> - David<br>

><br>

><br>

</div></div></blockquote></div><br></div></div>