[PATCH] D78628: [obj2yaml] - Program headers: simplify the computation of p_filesz.

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 23 09:43:31 PDT 2020


MaskRay added inline comments.


================
Comment at: llvm/lib/ObjectYAML/ELFEmitter.cpp:778-780
+      uint64_t FileSize = Fragments.back().Offset - PHeader.p_offset;
+      if (Fragments.back().Type != llvm::ELF::SHT_NOBITS)
+        FileSize += Fragments.back().Size;
----------------
grimar wrote:
> grimar wrote:
> > jhenderson wrote:
> > > This is a little subtle, and probably deserves a comment to explain why we pay attention to offset but not size of a single trailing SHT_NOBITS sections.
> > I was about to place the following comment:
> > 
> > ```
> > // SHT_NOBITS sections occupy no physical space in a file, we should not
> > // take their sizes into account when calculating the file size of a
> > // segment.
> > ```
> > 
> > But then remembered one thing and started to doubt about the usage of its offset.
> > It was mentioned in D69192 that the sh_offset of a SHT_NOBITS can be
> > "larger than the file size with some usage of objcopy".
> > 
> > I was unable to find a proper test case committed for this and I do not think I know a way to achieve it.
> > (`elf-disassemble-bss.test` (https://github.com/llvm/llvm-project/blob/master/llvm/test/tools/llvm-objdump/X86/elf-disassemble-bss.test) has a non-trailing .bss section that has a too large sh_offset though).
> > 
> > 
> > ELF spec says about sh_offset:
> > 
> > "This member's value gives the byte offset from the beginning of the file to
> > the first byte in the section. One section type, SHT_NOBITS described
> > below, occupies no space in the file, and its sh_offset member locates
> > the conceptual placement in the file."
> > 
> > Does "sh_offset member locates the conceptual placement in the file" imply that the
> > `sh_offset` have to be less than the file size? I am not sure that "in the file." == "can be outside of the file", it is wierd.
> > 
> > So can we use `sh_offset` like I do here?
> FTR, the approach of this patch it is the same as what LLD does:
> 
> ```
> template <class ELFT> void Writer<ELFT>::setPhdrs(Partition &part) {
>   for (PhdrEntry *p : part.phdrs) {
>     OutputSection *first = p->firstSec;
>     OutputSection *last = p->lastSec;
> 
>     if (first) {
>       p->p_filesz = last->offset - first->offset;
>       if (last->type != SHT_NOBITS)
>         p->p_filesz += last->size;
> ...
> ```
llvm-objcopy cannot make sh_offset larger than the file size (if it does, I will assuredly consider it a bug). GNU objcopy can do some in some cases. See `binutils-gdb/bfd/elf.c:assign_file_positions_for_load_sections` the variable `off_adjust`. A few bug reports are related https://sourceware.org/bugzilla/show_bug.cgi?id=25662

This is basically a size saving optimization but this is rather error-prone. In llvm-objcopy --only-keep-debug, we simply made `sh_offset` monotonically increasing to avoid such complexity.


================
Comment at: llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml:33
 # CHECK:    Offset: 0x2004
-# CHECK:    FileSize: 4
+# CHECK:    FileSize: 5
 # CHECK:    MemSize: 6
----------------
grimar wrote:
> MaskRay wrote:
> > Why is this changed?
> This patch fixes a bug I beliece: it's 
> 
> ```
>   # Program header with 2 SHT_NOBITS sections.
>   - Type:     0x6abcdef0
>     Offset:   0x2004
>     Sections:
>       - Section: .data
>       - Section: .nobits1
>       - Section: .nobits2
> ```
> 
> The layout is:
> 
> ```
> Section Headers:
>   [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
>   [ 0]                   NULL            0000000000000000 000000 000000 00     0   0  0
>   [ 1] .text             PROGBITS        0000000000000000 001000 000004 00     0   0 4096
>   [ 2] .rodata           PROGBITS        0000000000000000 002000 000004 00     0   0 4096
>   [ 3] .data             PROGBITS        0000000000000000 002004 000004 00     0   0  0
>   [ 4] .nobits1          NOBITS          0000000000000000 002008 000001 00     0   0  0
>   [ 5] .nobits2          NOBITS          0000000000000000 002009 000001 00     0   0  0
>   [ 6] .strtab           STRTAB          0000000000000000 002008 000001 00     0   0  1
>   [ 7] .shstrtab         STRTAB          0000000000000000 002009 000039 00     0   0  1
> ```
> 
> `0x2009 - 0x2004 == 0x5`, not `0x4`
I think it is hard to say that this is a bug. Conceptually sh_offset of a SHT_NOBITS section can be ignored. Usually, the ELF writer should set the sh_offset field of `.nobits2` to 0x2008 because there is no need to leave a one-byte gap.

I don't think this trivia matter much though, handling it either way is ok. If doing it one way helps simplify our overall logic, let's choose that way.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78628/new/

https://reviews.llvm.org/D78628





More information about the llvm-commits mailing list