[llvm] [llvm-objcopy] Segment based binary output format for objcopy (PR #68569)

Peter Smith via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 13 08:45:17 PDT 2023


https://github.com/smithp35 commented:

I'm personally not keen on having a llvm-fromelf alias for a few reasons, some might be resolved but some I don't think can be resolved.

The first is a comparison with llvm-readelf. The llvm-readelf tries to be a drop in replacement for readelf and tries to follow the same syntax for all options. This can't be done for fromelf in total as fromelf is a mix of objcopy, objdump, dwarfdump, strip et-al. Unless there is a commitment to mimic fromelf I think this sets expectations that cannot be matched. Even if we only mimic the binary/hex writing parts there are a lot more options that fromelf has and behaviour that are not implemented.

The second, and something that would need to be checked, is whether closely using the name of a commercial tool may bring down legal/trademark problems. Possible that there is no problem, but I would personally not want to take the risk.

Lastly I think the options can be added to llvm-objcopy with descriptive names and developers could easily find them, or be told that these were functionally equivalent to the fromelf options.

With that in mind it looks like the options that have been added are a new output format -O sbin.

If it is possible I personally would not add this as a separate format, but as an additional option modifying -O bin. In other words the output is still bin, it is just that the tool splits up the output into multiple parts, much like --only-section <section>. This can be combined with other output formats like hex without having to additional segement based outputs for all the other hex formats. It looks like in the original proposal it was --dump-segments.

It may be best to go back to discourse with the proposed interface as it will be easier to have a conversation with multiple people there. This is just my own opinion and I'm not a maintainer.

Some other thoughts:
* The fromelf interface https://developer.arm.com/documentation/101754/0621/fromelf-Reference/fromelf-Command-line-Options/--bin?lang=en will output a single file with the desired output filename if there is only one segment, but a directory with the desired output filename containing the segments otherwise. Fromelf uses a convention that looks up the name of the first output section in the segment, but as mentioned on discourse this is something we should change to something like segment1 segment2 etc. as existing fromelf customers get confused with this convention. I can't see tests for both of these cases. 
* How does this combine with --only-section=<section list>? In theory this is well defined and only the sections in the list could be written. Can we have tests added showing the interaction, in particular what happens when some segements are partially written and some not written at all.
* Is there scope for --only-segment=<segment list>? Could be added later though.

In summary:
* I'd prefer that the llvm-fromelf alias is dropped, or at least renamed not to say fromelf.
* Would a modifier like --dump-segments be preferable to a new output format? That would combine with hex output formats as well, which can also be written out per segment. That would require more tests to show that it worked with hex too.
* It looks like some more test cases are required.

I've not gone through the code in detail.

https://github.com/llvm/llvm-project/pull/68569


More information about the llvm-commits mailing list