[llvm-dev] GC for defsym'd symbols in LLD

Fāng-ruì Sòng via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 3 23:05:02 PST 2019


On Tue, Dec 3, 2019 at 7:02 PM Shoaib Meenai via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> LLD treats any symbol referenced from a linker script as a GC root, which makes sense. Unfortunately, it also processes --defsym as a linker script fragment internally, so all target symbols of a --defsym also get treated as GC roots (i.e., if you have something like --defsym SRC=TGT, TGT will become a GC root). I believe this to be unnecessary for defsym specifically, since you're just aliasing a symbol, and if the original or aliased symbols are referenced from anywhere, the symbol's section will get preserved anyway. (There are also cases where the defsym target can be an expression instead of just a symbol name, which I admittedly haven't thought about too hard, but I believe the same logic should hold in terms of any needed sections getting preserved regardless.) I want to change defsym targets specifically to not be considered as GC roots, so that they can be dead code eliminated. Does anyone foresee any issues with this?

% cat a.s
.globl _start, foo, bar
.text; _start: movabs $d, %rax
.section .text_foo,"ax"; foo: ret
.section .text_bar,"ax"; bar: nop
% as a.s -o a.o

% ld.bfd a.o --defsym d=foo --gc-sections -o a
  => .text_foo is retained
% ld.bfd a.o --defsym d=bar --gc-sections -o a
  => .text_bar is retained
% ld.bfd a.o --defsym d=1 --gc-sections -o a
  => Neither .text_foo nor .text_bar is retained
% ld.bfd a.o --defsym c=foo --defsym d=1 --gc-sections -o a
  => Neither .text_foo nor .text_bar is retained by GNU ld, whereas
     lld retains .text_foo

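For --defsym from=an_expression_with_to, GNU ld appears to add a
reference from 'from' to 'to', so 'to' is retained only when 'from' is
itself reachable. lld's behavior (https://reviews.llvm.org/D34195) is
more conservative: 'to' is treated as a GC root whether or not 'from'
is referenced.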

If we simply stop treating script->referencedSymbols as GC roots, the
section defining a --defsym target can be discarded even though the
alias is still in use, so instructions like `movabs $d, %rax` will no
longer be able to access the intended section. We can tweak our
behavior to be like GNU ld, but the additional complexity may not be
worthwhile.
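
To make the three options concrete, here is a toy model of the mark
phase for the a.s example above. It is only a sketch: the function
live_sections, the policy names "lld", "gnu" and "none", and the
tables DEFINED, RELOCS and ROOTS are made up for illustration and do
not correspond to code in either linker. The "gnu" policy encodes my
reading of the observed ld.bfd behavior, not its implementation.

```python
# Toy model of --gc-sections marking under three --defsym policies.
# Hypothetical names throughout; this is not lld or GNU ld code.

# Symbol -> section that defines it (from a.s).
DEFINED = {"_start": ".text", "foo": ".text_foo", "bar": ".text_bar"}
# Section -> symbols it references (.text has `movabs $d, %rax`).
RELOCS = {".text": ["d"]}
# Symbols that are always GC roots.
ROOTS = ["_start"]


def live_sections(defsyms, policy):
    """Return the set of sections that survive GC.

    defsyms: alias -> target symbol, or None for an absolute value
             such as --defsym d=1.
    policy:  "lld"  - every defsym target is a GC root
             "gnu"  - alias -> target is an edge, followed only when
                      the alias itself is reached
             "none" - defsym targets are ignored by GC
    """
    worklist = list(ROOTS)
    if policy == "lld":
        worklist += [t for t in defsyms.values() if t is not None]

    live, seen = set(), set()
    while worklist:
        sym = worklist.pop()
        if sym in seen:
            continue
        seen.add(sym)
        if sym in defsyms:
            # An alias defines no section of its own.
            target = defsyms[sym]
            if policy == "gnu" and target is not None:
                worklist.append(target)
            continue
        sec = DEFINED.get(sym)
        if sec is not None and sec not in live:
            live.add(sec)
            worklist.extend(RELOCS.get(sec, ()))
    return live


for policy in ("lld", "gnu", "none"):
    print(policy, sorted(live_sections({"c": "foo", "d": 1 and None}, policy)))
```

The loop at the end runs the last case above (c=foo, d=1) under each
policy. With d=foo, the "none" policy drops .text_foo even though
.text still references d, which is the problem described above.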

