[cfe-dev] Mangling & linkage of unnamed but instantiated nested classes

David Blaikie dblaikie at gmail.com
Tue Nov 13 17:53:45 PST 2012


On Tue, Nov 13, 2012 at 5:40 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Tue, Nov 13, 2012 at 5:19 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> +patch
>>
>> On Tue, Nov 13, 2012 at 5:19 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>> On Fri, Nov 9, 2012 at 5:38 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>> Comments below.
>>>>
>>>> On Fri, Nov 9, 2012 at 5:25 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>> On Fri, Nov 9, 2012 at 5:25 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>> On Mon, Nov 5, 2012 at 12:29 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>>>> On Sun, Nov 4, 2012 at 9:20 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>>> On Wed, Oct 31, 2012 at 2:19 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>>>>>> On Wed, Oct 31, 2012 at 12:37 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>>>>> Hi John, (& cfe-dev)
>>>>>>>>>>
>>>>>>>>>> I mentioned this in person last week & wanted to provide you with some
>>>>>>>>>> more details & ask for your opinion.
>>>>>>>>>>
>>>>>>>>>> Backstory:
>>>>>>>>>>
>>>>>>>>>> I originally came across this while trying to use
>>>>>>>>>> -Wunused-member-function & found it was flagging code like this:
>>>>>>>>>>
>>>>>>>>>> struct foo {
>>>>>>>>>>   struct {
>>>>>>>>>>     void func() { ... } // warning that this is 'unused'
>>>>>>>>>>   } x;
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> This surprised me, so I looked around & found that this function has
>>>>>>>>>> "no linkage". This seemed strange (because I would expect to be able
>>>>>>>>>> to call the function from multiple TUs that included the header
>>>>>>>>>> defining 'foo', compare its address for equality, & such things) & it
>>>>>>>>>> looks like the Right Thing is for func() to have the same linkage as,
>>>>>>>>>> say, an inline member function of 'foo' would have.
>>>>>>>>>>
>>>>>>>>>> The standard (3.5[basic.link]) seems to "miss" this case depending on
>>>>>>>>>> how you read it:
>>>>>>>>>>
>>>>>>>>>> 1) arguably p2, which says that "A name is said to have linkage when
>>>>>>>>>> it might denote the same ... function ... as a name introduced by a
>>>>>>>>>> declaration in another scope: - When a name has external linkage, the
>>>>>>>>>> entity it denotes can be referred to by names from scopes of other
>>>>>>>>>> translation units or from other scopes of the same translation unit"
>>>>>>>>>>
>>>>>>>>>> 2) p5: "... a member function ... has external linkage if the name of
>>>>>>>>>> the class has external linkage" (with an exception only for unnamed
>>>>>>>>>> classes (& enumerations) defined in class-scope typedef declarations
>>>>>>>>>> such that the class or enumeration has the dypedef name for linkage
>>>>>>>>>> purposes (7.1.3)) & there are no rules that seem to govern the linkage
>>>>>>>>>> of this unnamed class.
>>>>>>>>>>     p8 "Names not covered by these rules have no linkage"
>>>>>>>>>
>>>>>>>>> [basic.link]p8: "A type is said to have linkage if and only if [...]
>>>>>>>>> it is an unnamed class or enumeration member of a class with linkage".
>>>>>>>>>
>>>>>>>>> I'm going to assume it's an oversight in the standard that the members
>>>>>>>>> of such an unnamed class don't have linkage.
>>>>>>>>>
>>>>>>>>>> The next step was also to look at the mangling compared to GCC. After
>>>>>>>>>> modifying the linkage of functions like this (with this change):
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --- lib/AST/Decl.cpp
>>>>>>>>>> +++ lib/AST/Decl.cpp
>>>>>>>>>> @@ -496,8 +496,7 @@ static LinkageInfo getLVForClassMember(const
>>>>>>>>>> NamedDecl *D, bool OnlyTemplate) {
>>>>>>>>>>    if (!(isa<CXXMethodDecl>(D) ||
>>>>>>>>>>          isa<VarDecl>(D) ||
>>>>>>>>>>          isa<FieldDecl>(D) ||
>>>>>>>>>> -        (isa<TagDecl>(D) &&
>>>>>>>>>> -         (D->getDeclName() || cast<TagDecl>(D)->getTypedefNameForAnonDecl()))))
>>>>>>>>>> +        isa<TagDecl>(D)))
>>>>>>>>>>      return LinkageInfo::none();
>>>>>>>>>>
>>>>>>>>>>    LinkageInfo LV;
>>>>>>>>>>
>>>>>>>>>> (& Richard reckons we might be able to simplify that check - just to
>>>>>>>>>> eliminate some template cases that still seem to get filtered out by
>>>>>>>>>> it)
>>>>>>>>>
>>>>>>>>> This looks like a step in the right direction.
>>>>>>>>>
>>>>>>>>>> I caused one test to fail:
>>>>>>>>>> test/CodeGenCXX/template-anonymous-types.cpp. A modified version (to
>>>>>>>>>> better investigate GCC 4.7's mangling) looks like this:
>>>>>>>>>>
>>>>>>>>>> struct S {
>>>>>>>>>>   enum { FOO = 42 };
>>>>>>>>>>   enum { BAR = 42 };
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> struct T {
>>>>>>>>>>   enum { FOO = 42 };
>>>>>>>>>>   enum { BAR = 42 };
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> template <typename T> struct X {
>>>>>>>>>>   T value;
>>>>>>>>>>   X(T t) : value(t) {}
>>>>>>>>>>   int f() { return value; }
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> template <typename T> int f(T t) {
>>>>>>>>>>   X<T> x(t);
>>>>>>>>>>   return x.f();
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> void test() {
>>>>>>>>>>   (void)f(S::FOO);
>>>>>>>>>>   (void)f(S::BAR);
>>>>>>>>>>   (void)f(T::FOO);
>>>>>>>>>>   (void)f(T::BAR);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> & with my change we get the right linkage for the instantiations of
>>>>>>>>>> 'f' (linkonce_odr instead of internal) but the mangling is still
>>>>>>>>>> inconsistent with GCC at least. Clang mangles these 4 'f's as:
>>>>>>>>>>
>>>>>>>>>> _Z1fIN1S3$_0EEiT_
>>>>>>>>>> _Z1fIN1S3$_1EEiT_
>>>>>>>>>> _Z1fIN1T3$_2EEiT_
>>>>>>>>>> _Z1fIN1T3$_3EEiT_
>>>>>>>>>>
>>>>>>>>>> GCC 4.7 mangles them as:
>>>>>>>>>>
>>>>>>>>>> _Z1fIN1SUt_EEiT_
>>>>>>>>>> _Z1fIN1SUt0_EEiT_
>>>>>>>>>> _Z1fIN1TUt_EEiT_
>>>>>>>>>> _Z1fIN1TUt0_EEiT_
>>>>>>>>>>
>>>>>>>>>> I don't think we have any class-specific unnamed nested type counter
>>>>>>>>>> that would implement that Ut_, Ut0_, ... mangling scheme, though I can
>>>>>>>>>> imagine where one might be added (I'm not very familiar with IRGen
>>>>>>>>>> though, so I'll certainly be happy to have any pointers about how that
>>>>>>>>>> could/should be implemented).
>>>>>>>>>
>>>>>>>>> We have a similar scheme already implemented for lambdas; look at
>>>>>>>>> getLambdaManglingNumber() etc.
>>>>>>>>
>>>>>>>> Thanks for the pointer. Following the lines there I've had a first
>>>>>>>> blush at this (see attached).
>>>>>>>>
>>>>>>>> I've not addressed some of the issues Richard brought up but I'd be
>>>>>>>> happy to take the time to generalize this to handle.
>>>>>>
>>>>>> Spent a little time trying to generalize this to handle the function
>>>>>> local case but it was taking a bit too much of my time so I figured
>>>>>> I'd at least get this case handled (which addresses my issue of
>>>>>> -Wunused-function & generally improves the world a little bit at
>>>>>> least).
>>>>>>
>>>>>>> +  if (!Tag->getName().empty() || Tag->getTypedefNameForAnonDecl() ||
>>>>>>> +      !isa<CXXRecordDecl>(Tag->getParent()))
>>>>>>> +    return -2;
>>>>>>> +
>>>>>>> +  std::pair<llvm::DenseMap<const DeclContext *, int>::iterator, bool>
>>>>>>> P = UnnamedMangleContexts.insert(std::make_pair(Tag->getParent(),
>>>>>>> -1));
>>>>>>>
>>>>>>> getParent() doesn't return a canonical DeclContext.
>>>>>>
>>>>>> I'm not entirely sure what bugs this might cause (test cases welcome)
>>>>>> but I believe I've addressed this in the newly attached patch.
>>>>
>>>> It might not matter here, because all decls contained in a RecordDecl
>>>> generally have the same parent... this has a bigger impact for
>>>> NamespaceDecl, for example, because there are two different Decls
>>>> which are semantically the same context.
>>>
>>> Thanks for the explanation.
>>>
>>>>
>>>>>>>  Also, 80 columns.
>>>>>>
>>>>>> Right, sorry, I was mostly looking to see if I was generally on the
>>>>>> right track. Fixed, though.
>>>>>>
>>>>>>>  Also, we probably don't want to generate numbers for a decl unless
>>>>>>> the mangling is actually externally visible.
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>>
>>>>>>> --- include/clang/AST/Decl.h
>>>>>>> +++ include/clang/AST/Decl.h
>>>>>>> @@ -2486,6 +2486,8 @@ private:
>>>>>>>    /// otherwise, it is a null (TypedefNameDecl) pointer.
>>>>>>>    llvm::PointerUnion<TypedefNameDecl*, ExtInfo*> TypedefNameDeclOrQualifier;
>>>>>>>
>>>>>>> +  int UnnamedManglingNumber;
>>>>>>> +
>>>>>>>    bool hasExtInfo() const { return TypedefNameDeclOrQualifier.is<ExtInfo*>(); }
>>>>>>>    ExtInfo *getExtInfo() { return TypedefNameDeclOrQualifier.get<ExtInfo*>(); }
>>>>>>>    const ExtInfo *getExtInfo() const {
>>>>>>>
>>>>>>> We don't really want to add an extra member to DeclContext just to
>>>>>>> handle an obscure edge case.
>>>>>>
>>>>>> Used a separate side table in the ASTContext to keep the numbers. (& I
>>>>>> changed the numbers to be simpler - counting from zero instead of -1,
>>>>>> then doing an offset to compute the actual mangle number. Let me know
>>>>>> if you'd prefer it to work the other way)
>>>>>
>>>>> (now with the actually up-to-date patch)
>>>>
>>>> @@ -2582,8 +2581,9 @@ void
>>>> TagDecl::setTypedefNameForAnonDecl(TypedefNameDecl *TDD) {
>>>>  void TagDecl::startDefinition() {
>>>>    IsBeingDefined = true;
>>>>
>>>> -  if (isa<CXXRecordDecl>(this)) {
>>>> -    CXXRecordDecl *D = cast<CXXRecordDecl>(this);
>>>> +  getASTContext().addUnnamedTag(this);
>>>>
>>>> Will getTypedefNameForAnonDecl actually return the correct result
>>>> here?
>>>
>>> Seems not - thanks for the catch.
>>>
>>>> (I'd like to see a test to make sure that "typedef struct {} x"
>>>> doesn't cause us to increment the mangling number, assuming that's
>>>> consistent with gcc.)
>>>
>>> Test case added & then I had to restructure the code to address this.
>>> Since we don't know which way a type will be mangled until after we've
>>> parsed all the declarators in the group (eg: typedef struct{} *x, *y,
>>> z; - only when we see 'z' do we know that we have a linkage name for
>>> the struct, if we only have x and y, then we don't have such a name &
>>> we need to use the UtX_ naming) I had to inject the logic into
>>> FinalizeDeclaratorGroup (or somewhere near there - perhaps there's a
>>> better spot). This means we miss the original test case failure of:
>>> struct foo { enum { X }; }; since there is no declarator group in tht
>>> instance, just a standalone decl. So I added this to the
>>> ParsedFreeStandingDeclSpec as well.
>>>
>>> If there's some more appropriate common point to put this logic, I'm
>>> all ears - but it seems to me there's no common point in Sema for
>>> these two cases so it's either duplicated or could be non-duplicated
>>> if it were up in the Parser (or had a new Sema entry point to call
>>> from there).
>>>
>>> Open to ideas - thanks again for the help/review/pointers.
>
> That seems reasonable.
>
> +void ASTContext::addUnnamedTag(const TagDecl *Tag) {
> +  if (!Tag->getName().empty() || Tag->getTypedefNameForAnonDecl() ||
> +      !isa<CXXRecordDecl>(Tag->getParent()) || Tag->getLinkage() !=
> ExternalLinkage)
> +    return;
>
> Why do we need to enforce that the parent is a CXXRecordDecl? I think
> you can run into the same issue in other situations, no?  (Or were you
> intentionally putting that off to another patch?  I forget.)

I played around with expanding this to the function-local case but had
a few problems. Figured it would be best to separate that out for a
subsequent commit.

>
> +  std::pair<llvm::DenseMap<const DeclContext *, unsigned>::iterator,
> bool> P = UnnamedMangleContexts.insert(std::make_pair(Tag->getParent(),
> 0));
>
> 80 columns.

Hrm, sorry - not sure how that snuck back in. Thanks for the catch.

> I don't see any other issues.

Thanks for the help/review. Committed in r167906.

- David



More information about the cfe-dev mailing list