[PATCH] D109972: Regenerate LC_CODE_SIGNATURE during llvm-objcopy operations

Tue Sep 21 18:07:04 PDT 2021

alexander-shaposhnikov added a comment.

I think there are a few considerations here, so to help unblock this effort below I'll try to share my understanding/perspective.

1/ Regarding the code placement - if currently there is no good place for it with a deep sigh I'd be okay to have a version of it in objcopy. This duplication is not ideal, to me it somehow feels  ok to have it in some library 
(e.g. ArchiveWriter is in libObject and it's used in multiple places), but I would recommend not to place it into the existing headers that essentially document the binary format and are not directly related to the new functionality.
But if there is no good place / or there are strong objections - okay, it's important to keep the public interfaces in a good shape.

2/ The blob ("signature" or "signature stab" (not sure which name is better here)) itself doesn't appear to be parsed anywhere, so, perhaps, at the moment no new parsing code is necessary in libObject.
If I understand correctly what this diff and the previous one try to accomplish - they try to add the functionality to generate this blob. I think having it in-tree has some benefits 
(though as @mtrent has pointed out it's fragile), the main one - it allows the tools to be hosted pretty much anywhere (e.g. on Windows or Linux)
(similarly to other utilities, e.g. llvm-objdump, llvm-readobj, etc), the downside is that it can break if/when the format changes (but if I understand correctly we are kind of already living with it in LLD).

3/ After looking at the implementation the main concerns (mentioned above) are the following: from the readability perspective it seems to be suboptimal to mix together input and output
and use a "hybrid" approach to parameters passing. 
It's also not obvious from the declaration that the buffer (passed to CodeSignatureSection::write(...)) **must** contain "a partially constructed  mach-o object", that's why the interface looked kinda complicated
and anyone reading this code will have to dig into the actual implementation details.
The main work is done by the method CodeSignatureSection::write(...), so it seemed natural to have a function that either simply returns the blob (preferred way), or, if it's performance-critical (need numbers)
writes the content in-place, but I'd try to make the signature more intuitive (e.g. it should be clear to the reader what the input/output are). 
If we end up introducing a class it would be good to clean up the interface (the same considerations as above), 
hide the implementation details and probably use an appropriate name (e.g. CodeSignatureBuilder or <your suggestion is here>).

P.S. unlike StringTableBuilder here the whole thing is constructed in a single step, that's why my first impression was that a simpler solution should suffice, but I do not insist on it.
P.P.S. the current code makes sense in LLD's context, but the situation changes once we step away from it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109972/new/

https://reviews.llvm.org/D109972