[PATCH] D55881: [llvm-objcopy] [COFF] Add support for removing symbols

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 2 03:38:48 PST 2019


jhenderson added inline comments.


================
Comment at: test/tools/llvm-objcopy/COFF/X86/remove-local-symbols.s:20-21
+
+# SYMBOLS-POST: 00000000 T mainfunc
+# SYMBOLS-POST: 00000000 d ptr
+# SYMBOLS-POST-EMPTY:
----------------
This will produce a false positive, if otherfunc is present, because you've got nothing checking the total number of symbols. I don't know if COFF has a convenient way of testing this, but I would recommend doing so. Otherwise, you should put a -NEXT suffix on this line (and similarly in the above block) - you should also show that there are no symbols before this list if doing things this way, since llvm-objcopy might have reordered the symbols to put otherfunc first instead of removing it.

Could we also test that two different local symbols are removed? That shows that we remove all locals not just the first.


================
Comment at: test/tools/llvm-objcopy/COFF/X86/remove-local-symbols.s:1
+# RUN: llvm-mc -triple=x86_64-windows-gnu %s -filetype=obj -o %t.o
+
----------------
mstorsjo wrote:
> alexshap wrote:
> > so I'm  confused - what decision has been made regarding tests (sorry, many people have commented, I'd like to understand the logic): 
> > all the new tests will use assembly  while yaml is deprecated, we will use both yaml and assembly, we will use yaml and assembly only in exceptional cases, or what ?
> > As far as I'm concerned - i want consistency + functionality, at least in the future. 
> I'm not aware of any other decision other than preferring yaml over binary object files.
> 
> I'd strongly prefer to be able to use both yaml and assembly for these tests; yaml is ideal for dumped output from a full toolchain, including debug info and everything that a real object file would contain. Assembly is better for the cases when I want to construct a specific setup to test certain cases, where this keeps the test much smaller and easier to read and understand. And for executables instead of object files, yaml is the only option as we can't rely on lld.
Yes, to my knowledge, we haven't agreed anything formal re. YAML versus assembly, but I personally would prefer YAML wherever possible. From experience, at least for ELF, assembly often doesn't provide enough low-level control which we need to produce objects that have the properties we want in the input (e.g. what kinds of relocations get emitted etc). The comment about calling `otherfunc` not resulting in a relocation is a perfect example of this. It also removes the need for the "PRE" checks.

Also, I might be wrong, but wouldn't making this use YAML allow us to drop the X86-requirement here?


================
Comment at: test/tools/llvm-objcopy/COFF/X86/remove-symbol1.s:22
+# SYMBOLS-POST: 00000000 T mainfunc
+# SYMBOLS-POST-NOT: otherfunc
+# SYMBOLS-POST: 0000000d T otherfunc2
----------------
My comments apply from the remove-local-symbols test apply equally well here. We need to more tightly show that there are no symbols in other places. Similarly, it might be nice to make this YAML based if possible.


================
Comment at: test/tools/llvm-objcopy/COFF/X86/remove-symbol2.s:6-10
+# SYMBOLS-PRE: 00000000 T mainfunc
+# SYMBOLS-PRE: 0000000c T otherfunc
+# SYMBOLS-PRE: 0000000d T otherfunc2
+# SYMBOLS-PRE: 00000000 d ptr
+# SYMBOLS-PRE-EMPTY:
----------------
I don't think this check really gives us anything in this case.


================
Comment at: test/tools/llvm-objcopy/COFF/X86/remove-symbol2.s:12
+
+# ERROR: Can't remove referenced symbol ptr
+
----------------
It would be nice if this test checked that the file name appears in the error. The symbol name should probably also be quoted in some fashion. At least on ELF, it is possible to have symbol names with spaces in, so the error can be a bit ambiguous!


================
Comment at: tools/llvm-objcopy/COFF/COFFObjcopy.cpp:38
+  Obj.removeSymbols([&](const Symbol &Sym) {
+    if (is_contained(Config.SymbolsToKeep, Sym.Name))
+      return false;
----------------
There seems to be an awful lot of code in this function that I don't think is tested. Perhaps it would make more sense to implement it in more patches (e.g. basic explicit symbol removal, then keep, then strip all or whatever etc), and test each part individually.


================
Comment at: tools/llvm-objcopy/COFF/COFFObjcopy.cpp:52
+                                                Sym.Name,
+                                            object_error::parse_failed));
+      return true;
----------------
This isn't a parsing failure. Use a more appropriate error value, like `invalid_argument`.


================
Comment at: tools/llvm-objcopy/COFF/Object.cpp:38-39
+Error Object::markSymbols() {
+  for (Symbol &Sym : Symbols)
+    Sym.Referenced = false;
+  for (const Section &Sec : Sections) {
----------------
Referenced is initialised to false. Do you need to reset it here as well? I don't envisage a situation where we should need to do this more than once on a single object.


================
Comment at: tools/llvm-objcopy/COFF/Object.h:27-28
+struct Relocation {
+  Relocation() {}
+  Relocation(const object::coff_relocation &R) : Reloc(R) {}
+
----------------
Are these constructors really necessary? Won't they be generated by default?


================
Comment at: tools/llvm-objcopy/COFF/Object.h:46-47
   ArrayRef<uint8_t> AuxData;
+  size_t Key;
+  size_t RawIdx;
+  bool Referenced = false;
----------------
Don't abbreviate unnecessarily (Idx -> Index). Do we really need both separately? If so, perhaps better names would be `OriginalIndex` and `Index` or similar, and then use the input symbol index as the key.


================
Comment at: tools/llvm-objcopy/COFF/Object.h:70
+  // Update SymbolMap and RawIdx in each Symbol. This must be called
+  // after updating the Symbols vector from outside of this class.
+  void updateSymbols();
----------------
This comment seems to indicate an encapsulation issue to me. If this function has to be called any time the vector is updated, perhaps the vector should be private and accessed only through member functions?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55881/new/

https://reviews.llvm.org/D55881





More information about the llvm-commits mailing list