[PATCH] D46896: [llvm-objcopy] Add --strip-unneeded option

Paul Semel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 16 08:33:47 PDT 2018


paulsemel added a comment.

In https://reviews.llvm.org/D46896#1101273, @jhenderson wrote:

> I'm now getting a bit lost. Please could you outline the GNU objcopy behaviour for the following axes:


Sure, I'll try to be as consistent as I can.

> Axis 1: elf type (ET_REL versus ET_DYN/ET_EXEC)
>  Axis 2: symbol binding (Global/Weak/Local)

There is a real distinction here. As you already know, for ET_REL, global symbols might be linked against other relocatable files. For this reason, and only for ET_REL, `objcopy` keeps those symbols.
I will try to give you an example. Consider the test file test/tools/llvm-objcopy/localize.test. If you run `yaml2obj` on it and then run `objcopy --strip-unneeded %t %t1`, you will notice that the weak and global symbols are kept in the resulting binary, even though they are not referenced by any relocation.
Now, if you take the same test file but change it from ET_REL to ET_DYN (and repeat the same procedure), you will notice that the symbols are no longer present (the symtab should actually be removed).
This is the behavior I'm trying to reproduce. If you take a look at the `objcopy` code, you won't see anything about this ET_REL handling.
But they also handle this option in the section removal part of `objcopy`, which makes them remove the symbol table when this option is set.

I didn't do it this way because I truly think that the symtab removal could be done in the writing part of `llvm-objcopy`. By this I mean that we could test whether the symtab is empty and, if so, just remove it. The second thing is that I wanted to avoid the problem mentioned by @alexshap in another review with the `keep-symbol` option. Doing it this way avoids that problem.

To summarize: for ELF types other than ET_REL, `objcopy` basically strips everything (except symbols referenced by a relocation). For ET_REL, it tries not to leave the binary in a broken state. Indeed, if we also removed Global/Weak symbols, we would break linking (which is not what we want).

> Axis 3: undefined/defined symbols

For types other than ET_REL, `objcopy` doesn't seem to care whether symbols are undefined or defined. BUT, as @jakehehrlich mentioned, undefined Global/Weak symbols (again, those not referenced by a relocation) are stripped from the binary. This is, as far as I can tell, the only case where `objcopy` strips G/W symbols in ET_REL ELF files.
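
The three axes above can be sketched as one small decision function. This is only an illustrative Python sketch of the behaviour as summarized in this comment, not code taken from `objcopy` or `llvm-objcopy`. The names `ElfType`, `Binding` and `is_stripped` are hypothetical, and treating unreferenced local symbols in ET_REL as stripped is an assumption that the discussion implies but does not state.

```python
from enum import Enum


class ElfType(Enum):
    ET_REL = 1
    ET_EXEC = 2
    ET_DYN = 3


class Binding(Enum):
    LOCAL = 0
    GLOBAL = 1
    WEAK = 2


def is_stripped(elf_type, binding, defined, in_reloc):
    """Return True if --strip-unneeded would drop the symbol (hypothetical helper).

    Encodes only the rules summarized in this comment.
    """
    # A symbol referenced by a relocation is always kept.
    if in_reloc:
        return False
    # Outside ET_REL, everything not needed by a relocation goes away,
    # regardless of binding or of defined/undefined.
    if elf_type is not ElfType.ET_REL:
        return True
    # ET_REL, local symbol: assumed stripped (not stated explicitly above).
    if binding is Binding.LOCAL:
        return True
    # ET_REL, Global/Weak: kept when defined (other relocatable files may
    # link against them), stripped when undefined.
    return not defined
```

For example, `is_stripped(ElfType.ET_REL, Binding.GLOBAL, True, False)` is `False`, while the same symbol in an ET_DYN file gives `True`.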

> This should help us evaluate the behaviour and determine if the approach taken is correct. It should be straightforward to generate these different cases using yaml2obj, like you do in the tests.

Hope this clarifies things; please tell me if you want me to elaborate on any other points :)


Repository:
  rL LLVM

https://reviews.llvm.org/D46896




