[llvm-dev] Weak symbol/alias semantics
Teresa Johnson via llvm-dev
llvm-dev at lists.llvm.org
Wed Jan 18 08:03:42 PST 2017
On Wed, Jan 18, 2017 at 7:40 AM, Teresa Johnson <tejohnson at google.com>
wrote:
>
>
> On Wed, Jan 18, 2017 at 7:35 AM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>>
>> On Jan 18, 2017, at 7:11 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>
>>
>>
>> On Sat, Jan 14, 2017 at 12:15 PM, Mehdi Amini <mehdi.amini at apple.com>
>> wrote:
>>
>>>
>>> On Jan 14, 2017, at 8:14 AM, Teresa Johnson <tejohnson at google.com>
>>> wrote:
>>>
>>>
>>>
>>> On Fri, Jan 13, 2017 at 9:54 PM, Mehdi Amini <mehdi.amini at apple.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I believe we should get it right (and simpler) if (when…) we move to
>>>> the representation we discussed last spring: https://llvm.org/bugs/
>>>> show_bug.cgi?id=27866
>>>>
>>>> Our conclusion was that we should always have alias pointing to private
>>>> anonymous and nothing else, so when we currently have:
>>>>
>>>> @a = global i32 0
>>>> @b = alias @a
>>>>
>>>> It should always become:
>>>>
>>>> @0 = private i32 0
>>>> @a = alias @0
>>>> @b = alias @0
>>>>
>>>
>>> Yes that has some nice properties. I think it makes at least one case
>>> harder though.
>>>
>>> Consider:
>>>
>>> 1) Original:
>>> define linkonce_odr void @a() { ... }
>>> @b = linkonce_odr alias void (), void ()* @a
>>>
>>> 2) In the proposed canonical form this becomes:
>>> define private void @0() { ... }
>>> @a = linkonce_odr alias void (), void ()* @0
>>> @b = linkonce_odr alias void (), void ()* @0
>>>
>>> And let's say @a and @b are both non-prevailing in this module. I think
>>> we lose inlining ability for these functions when we start dropping
>>> non-prevailing copies.
>>>
>>> In 1) we could transform this (as described in my earlier email) to:
>>>
>>> define available_externally void @a() { ... }
>>> /* RAUW @b with @a */
>>>
>>> and we can inline @a into all callsites (which includes calls original
>>> via @b), before dropping it.
>>>
>>> With 2), I think we would need to transform this to:
>>>
>>> define private void @0() { ... }
>>> declare void @a()
>>> declare void @b()
>>>
>>> and we no longer can inline calls to @a and @b. I suppose we could
>>> intervene and special case this scenario so that it is transformed as for
>>> 1) when there are non-prevailing odr aliases.
>>>
>>>
>>> The complication is that a cloneFunction() step would be needed right?
>>>
>>
>> I hadn't thought about that level of implementation detail, but in any
>> case a cloneFunction step will be needed with both the current alias format
>> and with your proposed new form (just in different cases though). My point
>> was that the proposed format doesn't always make the handling simpler.
>>
>>
>>> Alternatively I can imagine us allowing available_externally alias, I’m
>>> not sure why they are forbidden today other than because we map the
>>> object-level limitations in the IR, but I can miss something (we don’t have
>>> alias on Darwin either…).
>>>
>>
>> Allowing available_externally alias helps simplify the case where we have
>> a non-prevailing weak alias. We would still need to address the case where
>> the aliasee is non-prevailing, which I think is the trickier situation to
>> handle.
>>
>>
>> I'm missing something, aren't we talking about the proposal where all
>> aliasee are always private?
>>
>
> Oh I see. Yes, that is the part simplified by the proposed new alias
> handling. I misunderstood what your "Alternatively" was referring to.
>
> I'm a little wary of making fixing this issue for ThinLTO dependent on
> changing the alias representation in the compiler. Maybe it wouldn't be
> that difficult to do though.
>
I suppose I could do this by transforming aliases to the proposed canonical
form just before we do the weak resolution in the ThinLTO backends. Longer
term, how would such a change be implemented? I.e. would the current IR
continue to be accepted (and generated by clang), and the transformation
occur within LLVM early in the pipeline?
Teresa
> Teresa
>
>
>> Mehdi
>>
>>
>>>
>>>
>>> I need to think through the various permutations of linkage types and
>>> see how they get transformed in the current and proposed canonical forms to
>>> see if there are other cases that need special handling.
>>>
>>> Teresa
>>>
>>>
>>>
>>>>
>>>> —
>>>> Mehdi
>>>>
>>>>
>>>>
>>>> On Jan 13, 2017, at 3:21 PM, Peter Collingbourne <peter at pcc.me.uk>
>>>> wrote:
>>>>
>>>> Hi Teresa,
>>>>
>>>> I think that to answer your question correctly it is helpful to
>>>> consider what is going on at the object file level. For your test1.c we
>>>> conceptually have a .text section containing the body of f, and then three
>>>> symbols:
>>>>
>>>> .weak f
>>>> f = .text
>>>> .globl strongalias
>>>> strongalias = .text
>>>> .weak weakalias
>>>> weakalias = .text
>>>>
>>>> Note that f, strongalias and weakalias are not related at all, except
>>>> that they happen to point to the same place. If f is overridden by a symbol
>>>> in another object file, it does not affect the symbols strongalias and
>>>> weakalias, so we still need to make them point to .text. I don't think it
>>>> would be right to make strongalias and weakalias into copies of f, as that
>>>> would be observable through function pointer equality. Most likely all you
>>>> need to do is to internalize f and keep strongalias and weakalias as
>>>> aliases of f.
>>>>
>>>> If we're resolving strongalias to f at -O2, that seems like a bug to
>>>> me. We can probably only resolve an alias to the symbol it references if we
>>>> are guaranteed that both symbols will have the same resolution, i.e. we
>>>> must check at least that both symbols have strong or internal linkage. If
>>>> we cared about symbol interposition, we might also want to check that both
>>>> symbols have non-default visibility, but I think that our support for that
>>>> is still a little fuzzy at the moment.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson <tejohnson at google.com>
>>>> wrote:
>>>>
>>>>> Hi Mehdi, Peter and David (and anyone else who sees this),
>>>>>
>>>>> I've been playing with some examples to handle the weak symbol cases
>>>>> we discussed in IRC earlier this week in the context of D28523. I was going
>>>>> to implement the support for turning aliases into copies in order to enable
>>>>> performing thinLTOResolveWeakForLinkerGUID on both aliases and
>>>>> aliasees, as a first step to being able to drop non-prevailing weak symbols
>>>>> in ThinLTO backends.
>>>>>
>>>>> I was wondering though what happens if we have an alias, which may or
>>>>> may not be weak itself, to a non-odr weak symbol that isn't prevailing. In
>>>>> that case, do we eventually want references via the alias to go to the
>>>>> prevailing copy (in another module), or to the original copy in the alias's
>>>>> module? I looked at some examples without ThinLTO, and am a little
>>>>> confused. Current (non-ThinLTO) behavior in some cases seems to depend on
>>>>> opt level.
>>>>>
>>>>> Example:
>>>>>
>>>>> $ cat weak12main.c
>>>>> extern void test2();
>>>>> int main() {
>>>>> test2();
>>>>> }
>>>>>
>>>>> $ cat weak1.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void weakalias() __attribute__((weak, alias ("f")));
>>>>> void strongalias() __attribute__((alias ("f")));
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>> printf("In weak1.c:f\n");
>>>>> }
>>>>> void test1() {
>>>>> printf("Call f() from weak1.c:\n");
>>>>> f();
>>>>> printf("Call weakalias() from weak1.c:\n");
>>>>> weakalias();
>>>>> printf("Call strongalias() from weak1.c:\n");
>>>>> strongalias();
>>>>> }
>>>>>
>>>>> $ cat weak2.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>> printf("In weak2.c:f\n");
>>>>> }
>>>>> extern void test1();
>>>>> void test2()
>>>>> {
>>>>> test1();
>>>>> printf("Call f() from weak2.c\n");
>>>>> f();
>>>>> }
>>>>>
>>>>> If I link weak1.c before weak2.c, nothing is surprising (we always
>>>>> invoke weak1.c:f at both -O0 and -O2):
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> If I instead link weak2.c first, so it's copy of f() is prevailing, I
>>>>> still get weak1.c:f for the call via weakalias() (both opt levels), and for
>>>>> strongalias() when building at -O0. At -O2 the compiler replaces the call
>>>>> to strongalias() with a call to f(), so it get's the weak2 copy in that
>>>>> case.
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> I'm wondering what the expected/correct behavior is? Depending on what
>>>>> is correct, we need to handle this differently in ThinLTO mode. Let's say
>>>>> weak1.c's copy of f() is not prevailing and I am going to drop it (it needs
>>>>> to be removed completely, not turned into available_externally to ensure it
>>>>> isn't inlined since weak isInterposable). If we want the aliases in weak1.c
>>>>> to reference the original version, then copying is correct (e.g. weakalias
>>>>> and strong alias would each become a copy of weak1.c's f()). If we however
>>>>> want them to resolve to the prevailing copy of f(), then we need to turn
>>>>> the aliases into declarations (external linkage in the case of strongalias
>>>>> and external weak in the case of weakalias?).
>>>>>
>>>>> I also tried the case where f() was in a comdat, because I also need
>>>>> to handle that case in ThinLTO (when f() is not prevailing, drop it from
>>>>> the comdat and remove the comdat from that module). Interestingly, in this
>>>>> case when weak2.c is prevailing, I get the following warning when linking
>>>>> and get a seg fault at runtime:
>>>>>
>>>>> weak1.o:weak1.o:function test1: warning: relocation refers to
>>>>> discarded section
>>>>>
>>>>> Presumably the aliases still refer to the copy in weak1.c, which is in
>>>>> the comdat that gets dropped by the linker. So is it not legal to have an
>>>>> alias to a weak symbol in a comdat (i.e. alias from outside the comdat)? We
>>>>> don't complain in the compiler.
>>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>> --
>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>>>>> 408-460-2413 <(408)%20460-2413>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Peter
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>>> 408-460-2413 <(408)%20460-2413>
>>>
>>>
>>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170118/4aa96afd/attachment.html>
More information about the llvm-dev
mailing list