[PATCH] D33964: [LLVM][llvm-objcopy] Added basic plumbing to get things started

Thu Jul 13 17:53:42 PDT 2017

jakehehrlich marked 5 inline comments as done.
jakehehrlich added inline comments.

================
Comment at: tools/llvm-objcopy/Object.cpp:193
+      // image is a copy of the file image, we preserve the file image as well.
+      Offset = FirstInSeg->Offset + Section->OriginalOffset -
+               FirstInSeg->OriginalOffset;
----------------
jhenderson wrote:
> I've just thought about yet another problem with this:
> 
> In some cases, the ELF header and/or Program Header table can appear in a segment before the first section. Consequently, we need to preserve the relative offset to the start of the section, rather than the first segment. At the moment, any space before the first section within the segment is ignored and lost.
> 
> This applies to Segment::finalize() too.
Yea PT_PHDR is a problem I have run into in extensions of this diff. I think it needs some special handling. I think I can support it when I officially add support for copying data of interstitial gaps and nested segments. It will still require special care however. Would you be ok with me not attempting to handle this right now? At the very least this requires handling nested segments because the PT_PHDR segment must be nested in a PT_LOAD segment.

The needed information is preserved now because the original offset of the segment is stored until the segment offset needs to be computed and the original offset of the section is now stored and kept. So that gap can be recovered when performing layout of the segment. This seems to be enough to confirm that there isn't a fundamental design issue that prevents the PT_PHDR case from being handled.

================
Comment at: tools/llvm-objcopy/Object.cpp:261-271
+  Elf_Shdr &Shdr = *reinterpret_cast<Elf_Shdr *>(Buf);
+  Shdr.sh_name = 0;
+  Shdr.sh_type = SHT_NULL;
+  Shdr.sh_flags = 0;
+  Shdr.sh_addr = 0;
+  Shdr.sh_offset = 0;
+  Shdr.sh_size = 0;
----------------
jhenderson wrote:
> Nothing is using Shdr here. Is this missing a write, or can it be removed?
This is actually correct I believe. This does point out a lack of clarity in what is going on here however. 

As is every section knows where their header belongs so I just pass "Out" to the section and it does the rest. This highlighted code writes the dummy section at the start of the section header table. So there's no missing write or anything. I'll add a comment explaining what's going on here.

================
Comment at: tools/llvm-objcopy/Object.h:136-141
+  uint64_t Entry;
+  uint64_t SHOffset;
+  uint32_t Type;
+  uint32_t Machine;
+  uint32_t Version;
+  uint32_t Flags;
----------------
jhenderson wrote:
> The size of some of these fields is dependent on the Elf Type (64 versus 32). I wonder if these should be based on ELFT? On the other hand, I'm quite happy that we just use the largest size for now.
> 
> At the moment, since the template is only instantiated for Elf64LE, this is fine, but it won't work once 32-bit support is added, unless the point where they are assigned to the Elf Header struct is modified to cast the values accordingly.
This was a recommendation that I got from Rui actually. To quote Rui when speaking about LLD.

> - Integral types such as Elf{32,64}_{XWord,Word,Addr,Offset} are not useful and better to avoid. We are using uint{8,16,32,64}_t instead. It seems it improves readability. (I honestly don't memorize > the real types of these ELFT types.)
> - ELFT::uint, whose size is 32/64 depending on ELF32/64, isn't useful. Always uint64_t to represent a value that can be 32 or 64. The waste of doing this is negligible, but it could drastically simplify > types because if your function uses only ELFT::uint, you can de-template that function by using uint64_t.

I've found this to be true as well. It definitely lets you get rid of a lot of templates. If we just used the templates when you had access to them (Like in Object here) and used the largest type when we didn't (like in Segment or SectionBase) we would get this strange inconsistency in our code. I'd like to avoid that.

Repository:
  rL LLVM

https://reviews.llvm.org/D33964