[llvm-dev] (no subject)

Eric Astor via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 25 04:59:02 PDT 2020


Thanks, Martin!

My biggest question is around the behavior for alias-to-alias linkage.
Using Microsoft tools (ml64.exe), if you define an external symbol t2,
alias t4 to t2, and alias t7 to t4, you get exactly what you asked for:
[ 8](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000001 t2
[ 9](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000001 t4
AUX indx 8 srch 3
[11](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000001 t7
AUX indx 9 srch 3

Using LLVM, we instead get a second weak default-null reference pointing
directly to t2, rather than to t4:
[ 3](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000001 t2
...
[ 7](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 t4
AUX indx 9 srch 3
[ 9](sec  1)(fl 0x00)(ty   0)(scl  69) (nx 0) 0x00000001 .weak.t4.default.t1
...
[17](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 t7
AUX indx 19 srch 3
[19](sec  1)(fl 0x00)(ty   0)(scl  69) (nx 0) 0x00000001 .weak.t7.default.t1

Because each ".weak" intermediate duplicates the aliasee's current
resolution, I think t7 can resolve differently than it would with the
Microsoft tools, for example when another object provides a strong
definition of t4.
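
To make the concern concrete, here is a toy Python model of weak-external
resolution. It is only a sketch: the link() and resolve() helpers, the
object layouts, and the address labels such as "a.obj+0x1" are made up for
illustration, and it assumes the linker follows a weak external's default
through to whatever that name finally resolves to.

```python
# Toy model of weak-external resolution; not a real linker.
# Each object maps a symbol name to ("def", address_label) for a strong
# definition, or ("weak", default_name) for a weak external.

def link(objects):
    """Merge symbol tables into strong definitions and weak defaults."""
    strong, weak = {}, {}
    for obj in objects:
        for name, (kind, value) in obj.items():
            if kind == "def":
                strong[name] = value          # value is an address label
            else:
                weak.setdefault(name, value)  # value is the default's name
    return strong, weak


def resolve(name, strong, weak):
    """Follow weak defaults until a strong definition is found."""
    seen = set()
    while name not in strong:
        if name in seen or name not in weak:
            raise LookupError(f"unresolved symbol: {name}")
        seen.add(name)
        name = weak[name]
    return strong[name]


# Shape of the ml64.exe output: t7 defaults to t4, which defaults to t2.
ms_obj = {
    "t2": ("def", "a.obj+0x1"),
    "t4": ("weak", "t2"),
    "t7": ("weak", "t4"),
}

# Shape of the LLVM output: each alias defaults to its own ".weak" copy.
llvm_obj = {
    "t2": ("def", "a.obj+0x1"),
    "t4": ("weak", ".weak.t4.default.t1"),
    ".weak.t4.default.t1": ("def", "a.obj+0x1"),
    "t7": ("weak", ".weak.t7.default.t1"),
    ".weak.t7.default.t1": ("def", "a.obj+0x1"),
}

# Hypothetical second object that provides a strong definition of t4.
other_obj = {"t4": ("def", "b.obj+0x40")}

for label, obj in (("ms", ms_obj), ("llvm", llvm_obj)):
    alone = resolve("t7", *link([obj]))
    with_strong_t4 = resolve("t7", *link([obj, other_obj]))
    print(label, alone, with_strong_t4)
```

In this model both layouts agree when the object is linked alone, but once
the second object is added, t7 follows t4 to the new definition in the
ml64.exe layout and stays at t2's address in the LLVM layout.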

Maybe we should eliminate the ".weak" intermediates if the reference's
target is already an external symbol? They seem unnecessary for that case.
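
As a rough sketch of that rule (Python, purely illustrative; the function
name and parameters are hypothetical and this is not how the LLVM code is
structured), the choice of a weak external's default target could look
like this:

```python
def weak_default_name(alias, aliasee, aliasee_is_external, uniquifier):
    """Pick the symbol a weak external should name as its default.

    Assumption for this sketch: an alias to a different, externally
    visible symbol can point at it directly. A symbol that is itself
    defined weak (alias == aliasee) still needs a uniquely named
    intermediate, so several objects can provide the same symbol.
    """
    if aliasee_is_external and alias != aliasee:
        return aliasee
    return f".weak.{alias}.default.{uniquifier}"


# Alias to an external symbol: no intermediate is created.
print(weak_default_name("t4", "t2", True, "t1"))
# Weak definition of func itself: the intermediate is kept.
print(weak_default_name("func", "func", True, "uniquename"))
```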

Thanks,
- Eric

On Thu, Sep 24, 2020 at 3:49 AM Martin Storsjö <martin at martin.st> wrote:

> Hi,
>
> On Wed, 23 Sep 2020, Eric Astor via llvm-dev wrote:
>
> > While working on alias support for the LLVM-ML project, I ran into a
> > feature implemented back in 2010: default-null weak externals in COFF,
> > a GNU extension.
> > https://reviews.llvm.org/rG17990d56907b
> > I'd like to disable this feature when targeting MSVC compatibility. Does
> > anyone have more context on this, and why it'd be a terrible idea?
> >
> > For context: This seems to be designed to let LLVM implement a GNU
> > extension in COFF libraries. However, it leads to very different
> > behavior than we see for cl.exe (and ml.exe) on Windows; for
> > already-defined aliasees, it injects an alternate placeholder
> > ".weak.<alias>.default.<uniquifier>" symbol which resolves back to the
> > current location. I admit, I'm not quite sure how this helps. If anyone
> > can explain the purpose, I'd really appreciate it!
>
> So, for the GNU extension, from the user's point of view, there are two
> potential use cases.
>
> First, a translation unit can reference a function declaration with
> __attribute__((weak)), with no implementation in the translation unit.
> This then evaluates either to NULL or to an actual implementation, if
> another, non-weak definition exists in another object file at link time.
>
> Secondly, multiple translation units may have function definitions that
> are marked with the weak attribute. You can have this in 0-N object files,
> and 0-1 object files containing a non-weak definition. If there's no
> non-weak definition, one of the weak definitions ends up picked, but if
> there is one, the non-weak one ends up used.
>
> As all this is consumed via GNU style attributes (in MinGW environments),
> it shouldn't really matter in an MSVC context.
>
> I recently worked on this to get the final details on this hooked up for
> COFF, so I'd be happy to have a look at any work touching this feature.
>
> > In Windows PE/COFF files, aliases typically just resolve to their target
> > symbol. For an example, see
> > https://reviews.llvm.org/D87403#inline-811289.
>
> For cases where a uniquely named symbol already exists, adding an alias
> directly to the target symbol sounds sensible. But when the symbol isn't
> set up as an alias and the implementation itself is marked weak, the
> uniquifying symbol name is needed, to allow multiple objects to provide
> the same symbol.
>
> Consider these two examples in GAS assembly form:
>
>          .globl uniquename
> uniquename:
>          ret
>
>          .globl func
> func:
>          ret
>
>          .weak aliasname
> aliasname = func
>
> This produces the following symbols, shown with llvm-objdump -t:
>
> [ 6](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 uniquename
> [ 7](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 func
> [ 8](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 aliasname
> AUX indx 10 srch 3        [pointing at .weak.aliasname.default.uniquename]
> [10](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 .weak.aliasname.default.uniquename
>
> So here .weak.aliasname.default.uniquename is identical to func, and as
> func itself is non-weak, aliasname could just as well have pointed
> directly at func instead.
>
>
> But for this case, the extra dance is necessary:
>
>          .globl uniquename
> uniquename:
>          ret
>
>          .weak func
>          .globl func
> func:
>          ret
>
> Producing:
> [ 6](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 uniquename
> [ 7](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 func
> AUX indx 9 srch 3
> [ 9](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 .weak.func.default.uniquename
>
>
>
> Initially, the non-weak symbols were just named ".weak.func.default", but
> this caused clashes if multiple object files defined the same one. I tried
> fixing this in https://reviews.llvm.org/D71711 by making the non-weak
> symbols that the weak ones point at static, but MSVC tools error out if
> you have a weak symbol pointing at a non-external symbol (as "weak" in
> COFF actually is "weak external"). Therefore I reverted that attempt and I
> later made https://reviews.llvm.org/D75989 that tries to make unique
> names for these symbols, to avoid clashes.
>
> // Martin
>