[llvm-dev] Weak symbol/alias semantics

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 13 22:24:24 PST 2017


On Fri, Jan 13, 2017 at 7:58 PM, Teresa Johnson <tejohnson at google.com>
wrote:

>
>
> On Fri, Jan 13, 2017 at 6:55 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Fri, Jan 13, 2017 at 4:53 PM, Teresa Johnson <tejohnson at google.com>
>> wrote:
>>
>>> Thanks, David and Peter. Some responses to Peter's email below. Teresa
>>>
>>> On Fri, Jan 13, 2017 at 3:21 PM, Peter Collingbourne <peter at pcc.me.uk>
>>> wrote:
>>>
>>>> Hi Teresa,
>>>>
>>>> I think that to answer your question correctly it is helpful to
>>>> consider what is going on at the object file level. For your weak1.c we
>>>> conceptually have a .text section containing the body of f, and then three
>>>> symbols:
>>>>
>>>> .weak f
>>>> f = .text
>>>> .globl strongalias
>>>> strongalias = .text
>>>> .weak weakalias
>>>> weakalias = .text
>>>>
>>>> Note that f, strongalias and weakalias are not related at all, except
>>>> that they happen to point to the same place. If f is overridden by a symbol
>>>> in another object file, it does not affect the symbols strongalias and
>>>> weakalias, so we still need to make them point to .text. I don't think it
>>>> would be right to make strongalias and weakalias into copies of f, as that
>>>> would be observable through function pointer equality. Most likely all you
>>>> need to do is to internalize f and keep strongalias and weakalias as
>>>> aliases of f.
>>>>
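To make the object-level picture above concrete, here is a minimal sketch in C. It assumes an ELF target and a GCC/Clang-style compiler that supports the weak and alias attributes; the helper aliases_share_address() is a made-up name for illustration only. It shows that the two aliases are separate symbols bound to the same address, so they compare equal as function pointers no matter which f() the linker ends up choosing.

```c
#include <stdio.h>

/* f is weak: another object file may supply the prevailing definition. */
void f(void) __attribute__((weak));
void f(void) { puts("In this file: f"); }

/* Two further symbols bound to the address of this file's body of f. */
void strongalias(void) __attribute__((alias("f")));
void weakalias(void) __attribute__((weak, alias("f")));

/* Hypothetical helper: returns 1 when both aliases have the same address. */
int aliases_share_address(void) {
    return strongalias == weakalias;
}
```

Turning each alias into its own copy of the body would make this comparison observable as false, which is the concern raised above.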
>>>
>>> Good point on wanting function pointer equality.  However, we can't
>>> simply internalize f(). We'll also need to rename the internalized copy.
>>> The reason is that we want the original f() references to resolve to the
>>> prevailing copy in the other module. Summarizing what we just talked about
>>> on IRC, when we have a non-prevailing weak/linkonce symbol f() that has an
>>> alias pointing to it,
>>>
>>
>> except for non-prevailing weak aliases.
>>
>
> Right, this is just dealing with aliased symbols, not the alias itself
> (which is discussed below).
>
>>
>>
>>
>>> we need to:
>>> 1) Rename and internalize f()
>>> 2) Create a new external decl f()
>>> 3) RAUW existing references (other than from the aliases) with the new
>>> local created in 1)
>>>
>>>
>> Should that be 'with the new external decl f() created in 2)'?
>>
>
> Yep, right, I switched that when I wrote it in the email!
>
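As a rough source-level picture of the end state of the three steps (with the correction that other references go to the new external decl), here is a sketch in C. It assumes an ELF target and GCC/Clang attributes. The name f_llvm_1 is only a stand-in for whatever renamed local the compiler would create, and the return value is there purely so the behavior can be checked.

```c
/* Step 1: the non-prevailing body, renamed and internalized.
 * "used" keeps the otherwise unreferenced static function emitted
 * so that the aliases below have something to bind to. */
__attribute__((used))
static int f_llvm_1(void) { return 1; }

/* The aliases still point at the local body, so they stay equal
 * to each other as function pointers. */
int strongalias(void) __attribute__((alias("f_llvm_1")));
int weakalias(void) __attribute__((weak, alias("f_llvm_1")));

/* Step 2: f is now only declared here. Step 3: every other reference
 * to f in this file would go through this declaration and bind to the
 * prevailing definition from another module at link time. */
extern int f(void);
```

Nothing in this file calls f(), so it links on its own; a caller of f() would need the prevailing definition from the other module.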
>
>>
>>
>>> I think if it is however a weak_odr/linkonce_odr we can simplify the
>>> process since all copies will be the same. We can make f()
>>> available_externally (to enable inlining), and simply convert references to
>>> aliases of f() into direct references to f() and drop the aliases - does
>>> that sound right?
>>>
>>
>>
>> Sounds right.
>>
>>
>>>
>>> Another tricky thing is if the weak symbol was a variable that is
>>> initialized via a __cxx_global_var_init function in the global_ctors list.
>>> If we have an alias to that symbol, presumably we'll want the new
>>> internalized/renamed version to get initialized instead?
>>>
>>>
>> If the initializer references the aliased symbol, then yes. If the
>> original weak symbol is referenced, I don't see why the prevailing one
>> should not be used.
>>
>
> I was thinking of the case where the aliased symbol is the original weak
> symbol. I.e. you have something like:
>
> @fv = weak global i8 0, align 8
> @strongalias = weak alias i8, i8* @fv
>
> (@fv is the aliased weak symbol). If @fv is initialized via an initializer
> in the global_ctors list, and therefore we follow the procedure described
> above and convert it to a renamed local like:
>
> @fv.llvm.1 = internal global i8 0, align 8
> @strongalias = weak alias i8, i8* @fv.llvm.1
>
> and the original converted to an external decl like:
>
> @fv = external global i8
>
> Then presumably the prevailing copy of @fv will be initialized elsewhere,
> and @fv.llvm.1 needs to be initialized here.
>


Right, the global initialization is part of the object definition.
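
A small C sketch of why the initializer has to travel with the definition, assuming an ELF target and GCC/Clang attributes. The constructor init_fv() stands in for __cxx_global_var_init, and read_through_alias() is a made-up helper. The alias names this file's storage, so a reader going through the alias only sees the initialized value if this file's initializer runs against that same storage.

```c
/* Weak definition: another object file may provide the prevailing copy. */
int fv __attribute__((weak)) = 0;

/* Alias bound to this file's storage for fv. */
extern int strongalias __attribute__((alias("fv")));

/* Stand-in for __cxx_global_var_init: runs before main(). */
__attribute__((constructor))
static void init_fv(void) { fv = 42; }

/* Hypothetical helper: reads the variable through the alias. */
int read_through_alias(void) { return strongalias; }
```

If the local copy were renamed and internalized as described above, the initializer would need to write to the renamed local for the alias to keep seeing 42.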

>
>
>
>
>>
>>>
>>> Now in the case where we have an alias that is itself a weak
>>> non-prevailing symbol, how we handle it will, I think, depend on what it is
>>> aliased to:
>>> a) aliased to a weak/linkonce non-prevailing symbol -> handle as
>>> described earlier
>>> b) aliased to a weak_odr/linkonce_odr non-prevailing symbol -> handle as
>>> described earlier
>>> c) aliased to a strong symbol or a prevailing symbol -> convert to
>>> external decl (I think this case is only possible if the alias is a non-odr
>>> weak/linkonce)
>>>
>>> Does that sound right?
>>>
>>>
>> non-prevailing weak aliases can probably be safely discarded. The
>> prevailing symbol may or may not be an alias itself.
>>
>
> Meaning just convert it to an external decl?
>

I believe so.

David

>
> Teresa
>
>
>>
>>
>> David
>>
>>
>>>
>>>> If we're resolving strongalias to f at -O2, that seems like a bug to
>>>> me. We can probably only resolve an alias to the symbol it references if we
>>>> are guaranteed that both symbols will have the same resolution, i.e. we
>>>> must check at least that both symbols have strong or internal linkage. If
>>>> we cared about symbol interposition, we might also want to check that both
>>>> symbols have non-default visibility, but I think that our support for that
>>>> is still a little fuzzy at the moment.
>>>>
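The minimum check described above can be written as a small predicate. This is only a sketch in C with a made-up, simplified linkage enum; it is not LLVM's actual API and it ignores visibility and interposition entirely.

```c
#include <stdbool.h>

/* Hypothetical, simplified set of linkage kinds. */
enum linkage { LINK_INTERNAL, LINK_STRONG, LINK_WEAK, LINK_LINKONCE };

/* A symbol with strong or internal linkage cannot be replaced by
 * another module's definition at link time. */
static bool resolves_locally(enum linkage l) {
    return l == LINK_INTERNAL || l == LINK_STRONG;
}

/* An alias may be resolved to its aliasee only if both symbols are
 * guaranteed to have the same resolution. */
bool can_fold_alias(enum linkage alias, enum linkage aliasee) {
    return resolves_locally(alias) && resolves_locally(aliasee);
}
```

With this check, the strongalias-to-weak-f case from the example is rejected, since f could resolve to another module's copy.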
>>>
>>> Per your and David's analysis it sounds like this is a bug then - I can
>>> file a bug to track it with the example.
>>>
>>> Regarding the comdat case I mentioned - Peter and I discussed on IRC and
>>> he pointed out that my case was illegal since aliases are by definition in
>>> the same comdat group as the symbol they alias. So in effect I had an
>>> incomplete comdat group.
>>>
>>
>>
>>
>>> Thanks,
>>> Teresa
>>>
>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson <tejohnson at google.com>
>>>> wrote:
>>>>
>>>>> Hi Mehdi, Peter and David (and anyone else who sees this),
>>>>>
>>>>> I've been playing with some examples to handle the weak symbol cases
>>>>> we discussed in IRC earlier this week in the context of D28523. I was going
>>>>> to implement the support for turning aliases into copies in order to enable
>>>>> performing thinLTOResolveWeakForLinkerGUID on both aliases and
>>>>> aliasees, as a first step to being able to drop non-prevailing weak symbols
>>>>> in ThinLTO backends.
>>>>>
>>>>> I was wondering though what happens if we have an alias, which may or
>>>>> may not be weak itself, to a non-odr weak symbol that isn't prevailing. In
>>>>> that case, do we eventually want references via the alias to go to the
>>>>> prevailing copy (in another module), or to the original copy in the alias's
>>>>> module? I looked at some examples without ThinLTO, and am a little
>>>>> confused. Current (non-ThinLTO) behavior in some cases seems to depend on
>>>>> opt level.
>>>>>
>>>>> Example:
>>>>>
>>>>> $ cat weak12main.c
>>>>> extern void test2();
>>>>> int main() {
>>>>>   test2();
>>>>> }
>>>>>
>>>>> $ cat weak1.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void weakalias() __attribute__((weak, alias ("f")));
>>>>> void strongalias() __attribute__((alias ("f")));
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>>   printf("In weak1.c:f\n");
>>>>> }
>>>>> void test1() {
>>>>>   printf("Call f() from weak1.c:\n");
>>>>>   f();
>>>>>   printf("Call weakalias() from weak1.c:\n");
>>>>>   weakalias();
>>>>>   printf("Call strongalias() from weak1.c:\n");
>>>>>   strongalias();
>>>>> }
>>>>>
>>>>> $ cat weak2.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>>   printf("In weak2.c:f\n");
>>>>> }
>>>>> extern void test1();
>>>>> void test2()
>>>>> {
>>>>>   test1();
>>>>>   printf("Call f() from weak2.c\n");
>>>>>   f();
>>>>> }
>>>>>
>>>>> If I link weak1.c before weak2.c, nothing is surprising (we always
>>>>> invoke weak1.c:f at both -O0 and -O2):
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> If I instead link weak2.c first, so its copy of f() is prevailing, I
>>>>> still get weak1.c:f for the call via weakalias() (both opt levels), and for
>>>>> strongalias() when building at -O0. At -O2 the compiler replaces the call
>>>>> to strongalias() with a call to f(), so it gets the weak2 copy in that
>>>>> case.
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> I'm wondering what the expected/correct behavior is? Depending on what
>>>>> is correct, we need to handle this differently in ThinLTO mode. Let's say
>>>>> weak1.c's copy of f() is not prevailing and I am going to drop it (it needs
>>>>> to be removed completely, not turned into available_externally, to ensure it
>>>>> isn't inlined, since weak linkage is interposable (isInterposable)). If we
>>>>> want the aliases in weak1.c to reference the original version, then copying
>>>>> is correct (e.g. weakalias and strongalias would each become a copy of
>>>>> weak1.c's f()). If we however
>>>>> want them to resolve to the prevailing copy of f(), then we need to turn
>>>>> the aliases into declarations (external linkage in the case of strongalias
>>>>> and external weak in the case of weakalias?).
>>>>>
>>>>> I also tried the case where f() was in a comdat, because I also need
>>>>> to handle that case in ThinLTO (when f() is not prevailing, drop it from
>>>>> the comdat and remove the comdat from that module). Interestingly, in this
>>>>> case when weak2.c is prevailing, I get the following warning when linking
>>>>> and get a seg fault at runtime:
>>>>>
>>>>> weak1.o:weak1.o:function test1: warning: relocation refers to
>>>>> discarded section
>>>>>
>>>>> Presumably the aliases still refer to the copy in weak1.c, which is in
>>>>> the comdat that gets dropped by the linker. So is it not legal to have an
>>>>> alias to a weak symbol in a comdat (i.e. alias from outside the comdat)? We
>>>>> don't complain in the compiler.
>>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>> --
>>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>>>> 408-460-2413
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Peter
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>