[PATCH] D32648: Remove line and file from DINamespace

Fri Apr 28 14:27:06 PDT 2017

SGTM.

> On Apr 28, 2017, at 14:26, Adrian Prantl <aprantl at apple.com> wrote:
> 
> In the patch I uploaded a minute ago, there is now a comment in DIBuilder, where they get created.
> 
> -- adrian
>> On Apr 28, 2017, at 2:25 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote:
>> 
>>> 
>>> On Apr 28, 2017, at 14:12, Adrian Prantl <aprantl at apple.com <mailto:aprantl at apple.com>> wrote:
>>> 
>>>> 
>>>> On Apr 28, 2017, at 2:05 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote:
>>>> 
>>>>> 
>>>>> On Apr 28, 2017, at 14:02, Adrian Prantl <aprantl at apple.com <mailto:aprantl at apple.com>> wrote:
>>>>> 
>>>>>> 
>>>>>> On Apr 28, 2017, at 1:10 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Apr 28, 2017 at 12:33 PM Adrian Prantl <aprantl at apple.com <mailto:aprantl at apple.com>> wrote:
>>>>>>> On Apr 28, 2017, at 11:55 AM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Apr 28, 2017 at 11:17 AM Adrian Prantl via Phabricator <reviews at reviews.llvm.org <mailto:reviews at reviews.llvm.org>> wrote:
>>>>>>> aprantl added a comment.
>>>>>>> 
>>>>>>> In https://reviews.llvm.org/D32648#740966 <https://reviews.llvm.org/D32648#740966>, @dblaikie wrote:
>>>>>>> 
>>>>>>> > I'm all for this, couple of questions though:
>>>>>>> >
>>>>>>> > - When would the use of "distinct" on an anonymous namespace matter? Since LLVM uses cross-CU references to refer to entities in different CUs, never importing/moving entities between CUs, the anonymous namespaces shouldn't collide, should they? (anymore than named namespaces would collide when there are different entities in different CUs but both in the same named namespace - LLVM still produces two separate namespaces, one in each CU, one with each of the respective entities)?
>>>>>>> 
>>>>>>> 
>>>>>>> I believe the `distinct`  is necessary because anonymous namespaces whose parent scope is the CU have a `null` scope. (This is what DIBuilder does to every type node to allow for type uniquing.) If the anonymous namespace are not distinct, all top-level anonymous namespaces would be uniqued. Thus two types from different CU's that are in different anonymous namespaces would both have the same scope pointing to the uniqued anonymous namespace. It is not unthinkable that this could happen to work correctly because of how DwarfDebug is processing the data, but I'd rather not rely on this.
>>>>>>> 
>>>>>>> I'd rather not add code that doesn't have a pretty clear rationale. "I'm not sure it'll work" leads to weird code that's hard to understand/explain because it's not necessarily founded in any actual need. (I think it was Marshall Clow who gave a good talk about "fear based programming" at cppcon a few years back which is a perspective I have a strong affinity/appreciation for)
>>>>>>> 
>>>>>> 
>>>>>> My point was that the Metadata in itself is incorrect after merging and that is worth fixing regardless of whether the DWARF backend happens to produce the right output.
>>>>>> 
>>>>>> a.cpp
>>>>>> namespace { int a; }
>>>>>> b.cpp
>>>>>> namespace { int b;}
>>>>>> 
>>>>>> a.ll
>>>>>> !0 = !DINamespace(scope: null)
>>>>>> !1 = !DILocalVariable(name: "a", scope: !0)
>>>>>> 
>>>>>> b.ll
>>>>>> !0 = !DINamespace(scope: null)
>>>>>> !1 = !DILocalVariable(name: "b", scope: !0)
>>>>>> 
>>>>>> a+b.ll # after llvm-link
>>>>>> !0 = !DINamespace(scope: null)
>>>>>> !1 = !DILocalVariable(name: "a", scope: !0)
>>>>>> !2 = !DILocalVariable(name: "b", scope: !0)
>>>>>> 
>>>>>> In my opinion it is incorrect that "a" and "b"  share the same "scope:" field (because they don't), regardless of whether it matters for the DWARF output. What do you think?
>>>>>> 
>>>>>> Yep - I think this is about thinking about the mechanics of all this in a different way.
>>>>>> 
>>>>>> The way I'm considering to think about this is rather than as the DINamespace being a unique object that represents the entity - you could picture this more the same way DIFile is used - as a thing to reduce the memory usage by deduplicating the information - not because there's some logical entity that is being represented by that node.
>>>>>> 
>>>>>> 
>>>>>> Put another way - if these nodes were distinct, and an entity (say a type) referencing an anonymous namespace was emitted into the same CU, what would that do? Perhaps ideally LLVM would end up producing two anonymous namespaces in a single CU - would that help compared to putting them in the same anonymous namespace? I'm not sure any debugger would do anything especially 'better' with this output.
>>>>> 
>>>>> I just verified that it is not necessary to make anonymous namespaces distinct to prevent uniquing of any of their children: All DIType descendants are either distinct (such as the tag types when their UID is empty) or safe to unique (basic types, pointers, vectors, ...). All function definitions are distinct anyway. All global variables are distinct, and associated with their containing CU via the globals list.
>>>>> 
>>>>> I also verified that my above example, even though there is only a single anonymous DINamespace, is compiled into two DW_TAG_namespaces, one per CU.
>>>>> 
>>>>> So this is really just a matter of aesthetics.
>>>>> Making the nodes unique is slightly more memory efficient.
>>>>> 
>>>>> Having confirmed that everything still works (be design, not by accident), I guess I can be convinced to use unique nodes everywhere.
>>>>> Duncan, what do you think?
>>>> 
>>>> The uniquing tables are fairly expensive in compile time (particularly when reading bitcode), so I'd still tend toward using 'distinct' where merging isn't necessary for correctness or memory usage.
>>> 
>>> We would have at most one anonymous namespace per CU: Uniquing wouldn't add much overhead, but we also wouldn't save a whole lot of memory.
>>> 
>>> I'm going to follow David's suggestion, I like that it will make the code (slightly) simpler, which is a good tie-breaker for me.
>> 
>> Sure thing; just make sure it's well documented somewhere why the namespaces get merged in IR (since from a C++ semantics perspective, it's not an intuitive choice).
>> 
>>> 
>>>> 
>>>> But I'm fine either way.
>>>> 
>>>>> -- adrian
>>>>>>  
>>>>>> 
>>>>>> -- adrian
>>>>>> 
>>>>>>> I'd love to see an example of what the distinctness of these nodes achieves. (arguably distinct nodes are cheaper on LTO merging, if I recall correctly? Is that the case/am I remembering correctly? So maybe it's worth doing for that reason, but I'm not sure - it would mean more nodes/bigger metadata, though probably not by much)
>>>>>>>  
>>>>>>> > - Which tests cover the LLVM codegen side of this change - several tests no longer look for the line/file, but do any tests ensure it is not present? (if this patch were reverted, which tests would fail (not a perfect question, since the change in the metadata format would cause parse failures - but not talking about those))
>>>>>>> 
>>>>>>> test/DebugInfo/X86/dwarf-public-names.ll
>>>>>>> will fail, but not for a good reason: It hardcodes the size of the CU :-)
>>>>>>> I'll add CHECK-NOTs to one of the tests!
>>>>>>> 
>>>>>>> Thanks!
>>>>>>>  
>>>>>>> 
>>>>>>> 
>>>>>>> https://reviews.llvm.org/D32648 <https://reviews.llvm.org/D32648>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170428/a46382f9/attachment-0001.html>