[PATCH] D67325: [ELF] Map the ELF header at imageBase

Wed Sep 11 08:24:12 PDT 2019

peter.smith added a comment.

Apologies for the delay in responding. I've gone through in a bit more detail. Just to make sure I understand and hopefully give an alternative explanation to anyone else following:

- Create an example with no .rodata, such as basic-aarch64:

  .text
          .globl _start
          .type _start, %function
  _start: 
          ret

- createPhdrs() will create the RO program header and allocate the ELF Header to it, as it is the first PT_LOAD.
- InputSections are assigned to OutputSections. No InputSections are assigned to the RO program segment, so it contains no SHF_ALLOC sections.
- fixSectionAlignments assigns alignTo(script->getDot(), config->maxPageSize) + (script->getDot() % config->maxPageSize) to addrExpr for both the ELF Header SyntheticSection (RO program segment) and .text (Executable program segment).
- AssignAddresses sets initial dot to 0x200000 (AArch64) + sizeof(headers) = 0x200120
- AssignAddresses sets the base address of .text to 0x210120 (as .text is SHF_ALLOC); the addrExpr for the ELF Header SyntheticSection (RO program header) is never used.
- allocateHeaders searches for the first SHF_ALLOC section and finds .text as there are no SHF_ALLOC sections in the RO program segment.
- allocateHeaders sets the ELF header to the address of the .text section.
- The RO program segment now overlaps the Executable program segment as the ELF Header is in the Executable program segment.
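
To make the arithmetic in the steps above easier to check, here is a minimal sketch of the address computation. It is not LLD code: alignTo here is a local stand-in for the LLVM helper of the same name, nextSegmentStart is a hypothetical name for the addrExpr expression, and the 0x10000 page size is my assumption, inferred from the 0x200120 -> 0x210120 step.

```cpp
#include <cstdint>

// Local stand-in for LLVM's alignTo(): round value up to a multiple of align.
constexpr std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Hypothetical name for the addrExpr described above: advance to the next
// page boundary, then keep the same offset within the page.
constexpr std::uint64_t nextSegmentStart(std::uint64_t dot,
                                         std::uint64_t maxPageSize) {
  return alignTo(dot, maxPageSize) + dot % maxPageSize;
}

constexpr std::uint64_t imageBase = 0x200000;  // AArch64 value quoted above
constexpr std::uint64_t headersSize = 0x120;   // sizeof(headers) in this example
constexpr std::uint64_t maxPageSize = 0x10000; // assumption, see lead-in

// Initial dot after the headers, then the start of .text.
static_assert(imageBase + headersSize == 0x200120, "initial dot");
static_assert(nextSegmentStart(imageBase + headersSize, maxPageSize) == 0x210120,
              "base address of .text");
```

The static_asserts reproduce the two addresses quoted in the list, so the snippet only needs to compile to confirm the arithmetic.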

My understanding of the proposed solution is to not call allocateHeaders() and instead, effectively, force the headers into the RO program segment, as we always know it will be first. This permits some simplification of the code. I think your comments on -Ttext aren't directly related to the code change, more of an observation. Let me know if I have that right.

One downside of this approach, albeit not a regression from the old behaviour, is that there is always an RO program segment. An alternative fix would be to detect that a program segment contains only the ELF Header and Program Header and move them to the first program segment that contains an SHF_ALLOC section. This would simplify the output when there is no .rodata to something like (AArch64 values simulated by producing a file with just rodata):

  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0000e0 0x0000e0 R   0x8
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000124 0x000124 RE 0x10000
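
As a rough cross-check of the sizes in that output, the sketch below recomputes them from the standard ELF64 structure sizes. The count of four program headers and the 4 bytes of section content are my assumptions, inferred from 0xe0 and 0x124; they are not taken from the patch.

```cpp
#include <cstdint>

constexpr std::uint64_t ehdrSize = 64; // sizeof(Elf64_Ehdr)
constexpr std::uint64_t phdrSize = 56; // sizeof(Elf64_Phdr)
constexpr std::uint64_t numPhdrs = 4;  // assumption: 0xe0 / 56
constexpr std::uint64_t contentSize = 4; // assumption: 4 bytes of section content

// PT_PHDR starts right after the ELF header and covers the whole table.
constexpr std::uint64_t phdrOffset = ehdrSize;                 // 0x40
constexpr std::uint64_t phdrTableSize = numPhdrs * phdrSize;   // 0xe0

// The single PT_LOAD covers the headers plus the section content.
constexpr std::uint64_t loadFileSize =
    phdrOffset + phdrTableSize + contentSize;                  // 0x124

static_assert(phdrOffset == 0x40, "PHDR offset");
static_assert(phdrTableSize == 0xe0, "PHDR FileSiz");
static_assert(loadFileSize == 0x124, "LOAD FileSiz");
```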

Any thoughts?

On whether PT_PHDR should be generated, the ELF spec says in the PT_PHDR description:

  Moreover, it may occur only if the program header table is part of the memory image of the program. If it is present, it must precede any loadable segment entry. See ``Program Interpreter'' below for more information.

ld.bfd seems to generate PT_PHDR if a PT_INTERP is needed. When I made a simple a.o from int main(void) { return 0; } I got a PT_PHDR from ld.bfd. It is possible that in your example:

  ld.bfd -o a -Ttext=0x3000 a.o => no PT_PHDR
  ld.bfd -o a -Ttext=0x3000 a.o dummy.so (--enable-separate-code [1]) => place .note.gnu.property at 0x4000e8.

your a.o might not have enough in it to generate a PT_INTERP so no PT_PHDR.

As long as our default linker script, without -nmagic or -omagic, can allocate headers (which I think your change ensures), I think it is fine for LLD to generate PT_PHDR. I consider not producing it when there is no PT_INTERP to be an optimization.

Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67325/new/

https://reviews.llvm.org/D67325