[PATCH] D38790: [ELF] Do not keep symbols if they referenced only from discarded sections.

Igor Kudrin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 16 03:41:01 PST 2017


ikudrin added inline comments.


================
Comment at: ELF/MarkLive.cpp:65
+  // For non-weak DSO symbols, mark the file to be added to a DT_NEEDED entry.
+  if (S->isWeak())
+    return;
----------------
ruiu wrote:
> Why do you have to exclude weak symbols? Please expand (or replace) the comment to explain why this code should be like this.
What about this?
```
// Weak symbols should not cause adding DT_NEEDED.
```
Anyway, it's mostly a replacement for the code removed from `SymbolTable::addShared()`.


================
Comment at: ELF/MarkLive.cpp:281
+    for (InputFile *F : ObjectFiles)
+      for (Symbol *S : cast<ObjFile<ELFT>>(F)->getLocalSymbols())
+        markSymbolLive<ELFT>(S);
----------------
ruiu wrote:
> Can't you just use getSymbols instead of getLocalSymbols?
Not all symbols come from the object files. For example, in `test/ELF/wrap.s`, symbol `__real_foo` would disappear if we didn't process `Symtab->getSymbols()`. So, we have to process symbols from `Symtab`. Additionally, local symbols from object files should also be processed, but we don't need to scan global symbols again because they are already processed at the first step.


================
Comment at: ELF/MarkLive.cpp:310-312
+  // Set Live flag for all symbols which survived.
+  // We already marked some of them in resolveReloc(), but if a symbol just
+  // points to a section and is not used in any relocation we haven't seen it.
----------------
ruiu wrote:
> Why do you need to keep such symbols? It seems like practically you can just ignore such symbols.
It's a replacement for the removed logic from `includeInSymtab()`. Now we have `Live` flag set for all existing symbols after GC phase.

There might be several cases when a symbol references an alive section, but we haven't seen it in relocations. For example:
```
$ cat a.s
.global _start, foo, bar
.text
_start:
	bl foo
	bl baz
.section .text.foo,"ax"
foo:
	nop
bar:
	nop
baz:
	nop
$ as a.s -o a.o
$ readelf -r a.o
Relocation section '.rela.text' at offset 0x178 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  00090000011b R_AARCH64_CALL26  0000000000000000 foo + 0
000000000004  00060000011b R_AARCH64_CALL26  0000000000000000 .text.foo + 8
$ ld.lld --gc-sections a.o -o a.out
```

The question is, should symbols `bar` and `baz` be kept or not? As you can see, `bar` points to some location inside the section, but is not involved in any relocation. For `baz`, things are a bit different; it is used in the code but in the corresponding relocation a section symbol is used instead.

The current behavior of `lld` and other linkers (`ld` and `gold`) is consistent, as they all keep such symbols. I believe that, in any case, this patch should not change that because it is intended to solve another issue.


https://reviews.llvm.org/D38790





More information about the llvm-commits mailing list