[PATCH] D55881: [llvm-objcopy] [COFF] Add support for removing symbols

Jake Ehrlich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 20 16:05:14 PST 2018


jakehehrlich marked 2 inline comments as done.
jakehehrlich added a comment.

Head's up I'm about to go home for the evening and I'm pretty well off for until the new year after that. I think others are in a similar boat. I'm not ready to LGTM this yet and I don't have time to give more review so this might have to wait until January 2nd.

In D55881#1337277 <https://reviews.llvm.org/D55881#1337277>, @mstorsjo wrote:

> I found them rather entangled, as removing sections also requires removing the corresponding symbols for those sections. But I can try to make this only removal of symbols, as an absolutely minimal first step.


Yeah I think given how simple section stripping seems to be for COFF relative to ELF this might make sense. I'm fine with just minimizing the number of options a bit. I think we can come back to minimizing handleArgs after the new years



================
Comment at: tools/llvm-objcopy/COFF/Object.cpp:19
+
+void Object::updateSymbols() {
+  SymbolMap.clear();
----------------
I was thinking of updateSymbols as more of an operation the user could take on each symbol. See updateSymbols in the ELF code for instance. This seems like something that can be done at finalization time.


================
Comment at: tools/llvm-objcopy/COFF/Object.cpp:35
+  Sections.erase(
+      std::remove_if(std::begin(Sections), std::end(Sections),
+                     [ToRemove](const Section &Sec) { return ToRemove(Sec); }),
----------------
mstorsjo wrote:
> jakehehrlich wrote:
> > Can sections reference each other? Can a symbol reference a section? A significant portion of the complexity in removeSection in ELF comes from the fact that we handle those sorts of cases. If you remove a section then depending on circumstance we either remove the referencing object as well or throw an error because you weren't *also* removing the reference.
> A symbol references a section, and a relocation can reference a symbol. There are cases where sections can reference each other (associative comdat symbols) that I haven't implemented yet - I don't think GNU objcopy does either (it doesn't know about the associative comdat concept at all, afaik). Getting those right would of course be ideal, but they're a bit out of scope of what I'm aiming at initially (basic stripping functionality).
> 
> Most of the functionality in this patch also revolves around this - removing symbols that belong to a removed section, etc, but it's rather spread out, and the Object class API isn't easy to use for constructing things from scratch, and updates requires coordinating a few steps.
Awesome! Good to hear COFF doesn't (at least yet) have that complexity.


================
Comment at: tools/llvm-objcopy/COFF/Object.cpp:48
+
+void Object::setReferencedSymbols(StringRef Filename) {
+  // This function is meant to be used before pruning any symbols.
----------------
mstorsjo wrote:
> jakehehrlich wrote:
> > Again don't follow the error handling that I used in ELF code. You don't need to pass the otherwise useless Filename if you return an error because you can add that information when you report the error in whatever handles this. Also sometimes it won't even be a file (see -I binary for instance)
> Ok, that makes sense. I didn't copy this design from the ELF code actually, I independently made the same bad design...
I copied it from LLD so you're not alone.


================
Comment at: tools/llvm-objcopy/COFF/Object.h:44
+  // for actually writing back to a file.
+  bool Referenced = false;
 };
----------------
mstorsjo wrote:
> jakehehrlich wrote:
> > Yeah this was the sort of thing we eventually compromised on but we agreed that it was a result of poor structure in the ELF code. Ideally this sort of thing should not be in Symbol; we just didn't find a way around it that didn't require significant upheaval in the code. Since we're in an earlier stage with COFF code we can find a better way that storing this with the Symbol itself perhaps?
> I don't instinctively think of a better way of storing it - can you elaborate on what you would have in mind?
I don't have anything specifically in mind, I don't remember the review too well. It's kind of formats specific. For instance a symbol can be "referenced" by more than one section. It might be nice to have sections list all the symbols that they reference (except the symbol table which is the owner not the referencer) but this could cause O(n^2) running times. I was hoping that some quirk or insight you might have into the nature of COFF might lead us toward a means of avoiding storing weather this was referenced or not while maintaining something about as efficient. Had I done a complete redesign of the ELF code with this in mind we might wound up with something like a graph that could be walked to take out all the things that we actually wind up using and stripping would just reduce the set of initial nodes for instance; That's a crazy different direction but maybe something less extreme is possible here.


================
Comment at: tools/llvm-objcopy/COFF/Object.h:72
+  void initRawSymbolTable();
+  void setReferencedSymbols(StringRef Filename);
+
----------------
jakehehrlich wrote:
> If possible its most ideal to keep this true as you go rather than setting it all at once. We found that wasn't easily possible to do correctly in the ELF code 1) That might not be true here but even if it is 2) A more generic interface to this would be to use updateSymbols which is the sort of generic operation on an Object that I'd like to see used in favor of something specific like this.
Oh other thing, I think we did this on the section level (which probably doesn't make sense here) but it was called "markSymbols" to mimic "mark and sweep" garbage collection.  I'd prefer that name but it hardly matters.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55881/new/

https://reviews.llvm.org/D55881





More information about the llvm-commits mailing list