[llvm-dev] Weak symbol/alias semantics

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 13 18:55:50 PST 2017


On Fri, Jan 13, 2017 at 4:53 PM, Teresa Johnson <tejohnson at google.com>
wrote:

> Thanks, David and Peter. Some responses to Peter's email below. Teresa
>
> On Fri, Jan 13, 2017 at 3:21 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
>
>> Hi Teresa,
>>
>> I think that to answer your question correctly it is helpful to consider
>> what is going on at the object file level. For your test1.c we conceptually
>> have a .text section containing the body of f, and then three symbols:
>>
>> .weak f
>> f = .text
>> .globl strongalias
>> strongalias = .text
>> .weak weakalias
>> weakalias = .text
>>
>> Note that f, strongalias and weakalias are not related at all, except
>> that they happen to point to the same place. If f is overridden by a symbol
>> in another object file, it does not affect the symbols strongalias and
>> weakalias, so we still need to make them point to .text. I don't think it
>> would be right to make strongalias and weakalias into copies of f, as that
>> would be observable through function pointer equality. Most likely all you
>> need to do is to internalize f and keep strongalias and weakalias as
>> aliases of f.
>>
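A minimal C sketch of the object-file picture above, assuming an ELF target and the GCC/Clang `alias` attribute. The three names are separate symbols that share one address, so function pointers taken through any of them compare equal as long as nothing overrides f. The helper `same_address` is hypothetical and not part of the original test case.

```c
#include <stdio.h>

/* Weak definition, as in weak1.c. */
void f(void) __attribute__((weak));
void f(void) { puts("In weak1.c:f"); }

/* Two more symbols bound to the same address as f. */
void strongalias(void) __attribute__((alias("f")));
void weakalias(void) __attribute__((weak, alias("f")));

/* Hypothetical helper: returns 1 when all three names resolve to one address. */
int same_address(void) {
    return f == strongalias && strongalias == weakalias;
}
```

Linked on its own, same_address() should return 1. If another object file overrode f, the two aliases would still be expected to match each other, which is the equality that turning them into copies would break.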
>
> Good point on wanting function pointer equality. However, we can't simply
> internalize f(). We'll also need to rename the internalized copy. The
> reason is that we want the original f() references to resolve to the
> prevailing copy in the other module. Summarizing what we just talked about
> on IRC, when we have a non-prevailing weak/linkonce symbol f() that has an
> alias pointing to it,
>

except for non-prevailing weak aliases.



> we need to:
> 1) Rename and internalize f()
> 2) Create a new external decl f()
> 3) RAUW existing references (other than from the aliases) with the new
> local created in 1)
>
>
Should that be 'with the new external decl f() created in 2)'?
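
A rough source-level sketch of steps 1) to 3), assuming step 3 redirects references to the external declaration from 2), as the question above suggests. It assumes an ELF target and the GCC/Clang `alias` attribute. The name `f_local`, the `call_*` helpers, and the integer return values are hypothetical, and the definition of f at the bottom only stands in for the prevailing copy that would really come from another module.

```c
/* 1) The non-prevailing body, renamed and internalized.
 *    "f_local" is a hypothetical name; it returns 1 to identify this copy. */
static int f_local(void) { return 1; }

/* The aliases keep pointing at the local body, so they still share an address. */
int strongalias(void) __attribute__((alias("f_local")));
int weakalias(void) __attribute__((weak, alias("f_local")));

/* 2) New external declaration of f. */
int f(void);

/* 3) Direct references to f now go through that declaration. */
int call_f(void) { return f(); }
int call_strongalias(void) { return strongalias(); }
int call_weakalias(void) { return weakalias(); }

/* Stand-in for the prevailing definition (weak2.c in the example).
 * In a real link it would live in another object file; it returns 2. */
int f(void) { return 2; }
```

With this layout, calls to f reach the prevailing body while calls through either alias reach the local body, and the two aliases still compare equal as function pointers.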



> I think that if it is a weak_odr/linkonce_odr symbol, however, we can simplify
> the process since all copies will be the same. We can make f()
> available_externally (to enable inlining), and simply convert references to
> aliases of f() into direct references to f() and drop the aliases - does
> that sound right?
>


Sounds right.


>
> Another tricky thing is if the weak symbol was a variable that is
> initialized via a __cxx_global_var_init function in the global_ctors list.
> If we have an alias to that symbol, presumably we'll want the new
> internalized/renamed version to get initialized instead?
>
>
If the initializer references the aliased symbol, then yes. If the original
weak symbol is referenced, I don't see why the prevailing one should not be
used.


>
> Now in the case where we have an alias that is itself a weak
> non-prevailing symbol, how we handle it will, I think, depend on what it is
> aliased to:
> a) aliased to a weak/linkonce non-prevailing symbol -> handle as described
> earlier
> b) aliased to a weak_odr/linkonce_odr non-prevailing symbol -> handle as
> described earlier
> c) aliased to a strong symbol or a prevailing symbol -> convert to
> external decl (I think this case is only possible if the alias is a non-odr
> weak/linkonce)
>
> Does that sound right?
>
>
Non-prevailing weak aliases can probably be safely discarded. The
prevailing symbol may or may not be an alias itself.


David


>
>> If we're resolving strongalias to f at -O2, that seems like a bug to me.
>> We can probably only resolve an alias to the symbol it references if we are
>> guaranteed that both symbols will have the same resolution, i.e. we must
>> check at least that both symbols have strong or internal linkage. If we
>> cared about symbol interposition, we might also want to check that both
>> symbols have non-default visibility, but I think that our support for that
>> is still a little fuzzy at the moment.
>>
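The linkage check described above can be sketched as a small predicate. This is only an illustration under simplifying assumptions: the `linkage` enum and both function names are hypothetical and are not LLVM's API, and visibility and interposition are ignored.

```c
#include <stdbool.h>

/* Hypothetical, simplified linkage categories (not LLVM's own enum). */
enum linkage { LINK_EXTERNAL, LINK_INTERNAL, LINK_WEAK, LINK_LINKONCE };

/* A symbol with strong (external) or internal linkage cannot be
 * replaced by a definition from another object file. */
static bool has_fixed_resolution(enum linkage l) {
    return l == LINK_EXTERNAL || l == LINK_INTERNAL;
}

/* An alias may be resolved to its aliasee only when both symbols are
 * guaranteed to resolve to the same definition. */
bool can_resolve_alias_to_aliasee(enum linkage alias, enum linkage aliasee) {
    return has_fixed_resolution(alias) && has_fixed_resolution(aliasee);
}
```

For the example in this thread, strongalias is external and f is weak, so the predicate returns false and the call to strongalias() would be left alone.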
>
> Per your and David's analysis it sounds like this is a bug then - I can
> file a bug to track it with the example.
>
> Regarding the comdat case I mentioned - Peter and I discussed on IRC and
> he pointed out that my case was illegal since aliases are by definition in
> the same comdat group as the symbol they alias. So in effect I had an
> incomplete comdat group.
>



> Thanks,
> Teresa
>
>
>> Thanks,
>> Peter
>>
>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson <tejohnson at google.com>
>> wrote:
>>
>>> Hi Mehdi, Peter and David (and anyone else who sees this),
>>>
>>> I've been playing with some examples to handle the weak symbol cases we
>>> discussed in IRC earlier this week in the context of D28523. I was going to
>>> implement the support for turning aliases into copies in order to enable
>>> performing thinLTOResolveWeakForLinkerGUID on both aliases and
>>> aliasees, as a first step to being able to drop non-prevailing weak symbols
>>> in ThinLTO backends.
>>>
>>> I was wondering though what happens if we have an alias, which may or
>>> may not be weak itself, to a non-odr weak symbol that isn't prevailing. In
>>> that case, do we eventually want references via the alias to go to the
>>> prevailing copy (in another module), or to the original copy in the alias's
>>> module? I looked at some examples without ThinLTO, and am a little
>>> confused. Current (non-ThinLTO) behavior in some cases seems to depend on
>>> opt level.
>>>
>>> Example:
>>>
>>> $ cat weak12main.c
>>> extern void test2();
>>> int main() {
>>>   test2();
>>> }
>>>
>>> $ cat weak1.c
>>> #include <stdio.h>
>>>
>>> void weakalias() __attribute__((weak, alias ("f")));
>>> void strongalias() __attribute__((alias ("f")));
>>>
>>> void f () __attribute__ ((weak));
>>> void f()
>>> {
>>>   printf("In weak1.c:f\n");
>>> }
>>> void test1() {
>>>   printf("Call f() from weak1.c:\n");
>>>   f();
>>>   printf("Call weakalias() from weak1.c:\n");
>>>   weakalias();
>>>   printf("Call strongalias() from weak1.c:\n");
>>>   strongalias();
>>> }
>>>
>>> $ cat weak2.c
>>> #include <stdio.h>
>>>
>>> void f () __attribute__ ((weak));
>>> void f()
>>> {
>>>   printf("In weak2.c:f\n");
>>> }
>>> extern void test1();
>>> void test2()
>>> {
>>>   test1();
>>>   printf("Call f() from weak2.c\n");
>>>   f();
>>> }
>>>
>>> If I link weak1.c before weak2.c, nothing is surprising (we always
>>> invoke weak1.c:f at both -O0 and -O2):
>>>
>>> $ clang weak12main.c weak1.c weak2.c -O0
>>> $ a.out
>>> Call f() from weak1.c:
>>> In weak1.c:f
>>> Call weakalias() from weak1.c:
>>> In weak1.c:f
>>> Call strongalias() from weak1.c:
>>> In weak1.c:f
>>> Call f() from weak2.c
>>> In weak1.c:f
>>>
>>> $ clang weak12main.c weak1.c weak2.c -O2
>>> $ a.out
>>> Call f() from weak1.c:
>>> In weak1.c:f
>>> Call weakalias() from weak1.c:
>>> In weak1.c:f
>>> Call strongalias() from weak1.c:
>>> In weak1.c:f
>>> Call f() from weak2.c
>>> In weak1.c:f
>>>
>>> If I instead link weak2.c first, so its copy of f() is prevailing, I
>>> still get weak1.c:f for the call via weakalias() (both opt levels), and for
>>> strongalias() when building at -O0. At -O2 the compiler replaces the call
>>> to strongalias() with a call to f(), so it gets the weak2 copy in that
>>> case.
>>>
>>> $ clang weak12main.c weak2.c weak1.c -O2
>>> $ a.out
>>> Call f() from weak1.c:
>>> In weak2.c:f
>>> Call weakalias() from weak1.c:
>>> In weak1.c:f
>>> Call strongalias() from weak1.c:
>>> In weak2.c:f
>>> Call f() from weak2.c
>>> In weak2.c:f
>>>
>>> $ clang weak12main.c weak2.c weak1.c -O0
>>> $ a.out
>>> Call f() from weak1.c:
>>> In weak2.c:f
>>> Call weakalias() from weak1.c:
>>> In weak1.c:f
>>> Call strongalias() from weak1.c:
>>> In weak1.c:f
>>> Call f() from weak2.c
>>> In weak2.c:f
>>>
>>> I'm wondering what the expected/correct behavior is? Depending on what
>>> is correct, we need to handle this differently in ThinLTO mode. Let's say
>>> weak1.c's copy of f() is not prevailing and I am going to drop it (it needs
>>> to be removed completely, not turned into available_externally, to ensure it
>>> isn't inlined, since weak isInterposable). If we want the aliases in weak1.c
>>> to reference the original version, then copying is correct (e.g. weakalias
>>> and strongalias would each become a copy of weak1.c's f()). If we however
>>> want them to resolve to the prevailing copy of f(), then we need to turn
>>> the aliases into declarations (external linkage in the case of strongalias
>>> and external weak in the case of weakalias?).
>>>
>>> I also tried the case where f() was in a comdat, because I also need to
>>> handle that case in ThinLTO (when f() is not prevailing, drop it from the
>>> comdat and remove the comdat from that module). Interestingly, in this case
>>> when weak2.c is prevailing, I get the following warning when linking and
>>> get a seg fault at runtime:
>>>
>>> weak1.o:weak1.o:function test1: warning: relocation refers to discarded
>>> section
>>>
>>> Presumably the aliases still refer to the copy in weak1.c, which is in
>>> the comdat that gets dropped by the linker. So is it not legal to have an
>>> alias to a weak symbol in a comdat (i.e. alias from outside the comdat)? We
>>> don't complain in the compiler.
>>>
>>> Thanks,
>>> Teresa
>>> --
>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>> 408-460-2413
>>>
>>
>>
>>
>> --
>> --
>> Peter
>>
>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413
>