[PATCH] D67689: [llvm-objcopy] Add support for --gap-fill and --pad-to options

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 24 03:52:05 PDT 2019


jhenderson added a reviewer: jakehehrlich.
jhenderson added a comment.

One major question to answer is how should --gap-fill interact with the preserving of segment contents that llvm-objcopy follows when segment contents are not covered by existing sections (see llvm/test/tools/llvm-objcopy/preserve-segment-contents.test for an example of the behaviour). I'd be inclined to have --gap-fill overwrite the data in this case, in which case a test should be written to show that the old data is no longer written in this case (see below for more details).

Some other questions/comments:

1. What should be the behaviour for gaps between the start of a segment and the start of the first section? I feel like --gap-fill should fill that gap too (same goes for end of the section and segment).
2. In relation to the preserve contents issue above, what should happen if a section is removed? Currently it fills it with zeroes. I think it should instead fill it with --gap-fill bytes (see `writeSegmentData`).
3. For --gap-fill, I think a simpler approach than what you have might be to change how Segments are written in `writeSegmentData`: if --gap-fill is not specified, use the existing contents of the segment, otherwise use --gap-fill's value.
4. For --pad-to, what should happen with SHT_NOBITS sections? The description you've written suggests that they should have their size increased to pad to the corresponding address, but this isn't the behaviour I see with GNU objcopy.
5. Also for --pad-to, I feel like your code is more complicated than it needs to be. I feel like the relevant changes would be most appropriate in or immediately before layoutSegments - identify what the last segment is and then resize it and the last section in it. I don't think you need to create new sections or segments, although you'll need to identify how you can increase the size.
6. What does GNU objcopy do for non-progbits sections that are last in the loadable segment, e.g. .dynamic sections?
7. What does GNU objcopy do if the last such section is nested inside multiple segments (e.g. because it's a TLS section)? I expect it to resize all such segments, which might not be trivial to achieve using llvm-objcopy's current architecture (sections are only in the top-most parent segment, and don't know about other segments they are in).



================
Comment at: llvm/include/llvm/Object/ELF.h:408
                        " has a sh_offset (0x" + Twine::utohexstr(Offset) +
-                       ") + sh_size (0x" + Twine(Size) +
+                       ") + sh_size (0x" + Twine::utohexstr(Size) +
                        ") that cannot be represented");
----------------
Hui wrote:
> asmith wrote:
> > jhenderson wrote:
> > > This and a couple of tests look like an unrelated change?
> > This is a bug fix. The value wasn’t printed correctly.
> This diagnostic message probably emitted when the layout of resized sections was not correctly done. However it is misleading.  The 'Size' is supposed to be in hex while it is not.  
I agree it's a bug fix, but it's a bug fix not directly related to this issue. Please create a separate patch for it.


================
Comment at: llvm/test/tools/llvm-objcopy/ELF/gap-fill.test:4
+#
+# clang-7 gapfillpadto.cpp -o gapfillpadto.elf
+# void swap(int *a, int *b) {
----------------
Hui wrote:
> jhenderson wrote:
> > Hui wrote:
> > > asmith wrote:
> > > > seiya wrote:
> > > > > Is it possible to use yaml2obj instead of adding a binary file? I think we just need few sections to make sure that this patch works well. For example (not tested):
> > > > > 
> > > > > ```
> > > > > !ELF
> > > > > FileHeader:
> > > > >   Class:           ELFCLASS64
> > > > >   Data:            ELFDATA2LSB
> > > > >   Type:            ET_EXEC
> > > > >   Machine:         EM_X86_64
> > > > > Sections:
> > > > >   - Name:            .text
> > > > >     Type:            SHT_PROGBITS
> > > > >     Content:         "AABBCCDD"
> > > > >   - Name:            .foo
> > > > >     Type:            SHT_PROGBITS
> > > > >     Content:         "EEFF"
> > > > >   - Name:            .bar
> > > > >     Type:            SHT_NOBITS
> > > > > ```
> > > > That's a good suggestion. Will look into that.
> > > This patch resizes the section and its containing segments. (Parent or child segment). One main part of the test is to check the section to segment mapping from PHDR. However yam2obj can't keep program headers.  
> > > This patch resizes the section and its containing segments. (Parent or child segment). One main part of the test is to check the section to segment mapping from PHDR. However yam2obj can't keep program headers.
> > 
> > I'm not sure I understand what you mean here? yaml2obj supports program headers (it has done for a couple of years now I think). See for example llvm/test/tools/yaml2obj/program-header.yaml etc as well as various llvm-objcopy tests using program headers (e.g. llvm/test/tools/llvm-objcopy/segment-test-remove-section.test).
> You are right. Actually I was then thinking about using obj2yaml to get yaml file and operate it with yaml2obj in the test.  Unfortunately obj2yaml hasn't supported dumping the program headers (A plausible easy feature to have but not in position yet). So no output about PHDR by yam2obj.
In general, we prefer to create the YAML file by hand, as obj2yaml output for an object is significantly more verbose than is required to test the behaviour. If after looking at existing examples of how to write program headers for yaml2obj, you need more assistance in creating a test input, I'm happy to help - I've had plenty of experience crafting them to test odd cases.


================
Comment at: llvm/tools/llvm-objcopy/ObjcopyOpts.td:234
+    : Eq<"pad-to",
+      "Pad the last loadable section up to <addr> with the value specified by --gap-fill if present or zero by default.">,
+      MetaVarName<"addr">;
----------------
This line is too long and should be broken up over multiple lines, like the other switch names. I think there's a clang-format invocation to do that for tablegen files, but I'm not sure what it is.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67689/new/

https://reviews.llvm.org/D67689





More information about the llvm-commits mailing list