[PATCH] D33964: [LLVM][llvm-objcopy] Added basic plumbing to get things started

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 6 01:53:16 PDT 2017


jhenderson added a comment.

By the way, although it makes it easier for me to see the changes, your current diff only shows the changes versus the previous diff, so has lost, for example, the tests.

As noted in one of my comments, I'm not certain that this will work in the nested segment case. Is there a test for this?



================
Comment at: tools/llvm-objcopy/Object.cpp:185
+      // We should respect interstitial gaps of allocated sections
+      Offset = FirstInSeg->Offset + Section->Addr - FirstInSeg->Addr;
+    }
----------------
jakehehrlich wrote:
> jhenderson wrote:
> > Could you explain why we are using a mix of offsets and addresses here, as it is surprising to me.
> The first an main point is that the Offsets change in this loop and what we want to respect is the difference between the section offsets in the original file. FirstInSeg->Offset no longer respects that information. This is changeable however by adding fields that keep track of the original file offset. This might make sense to use elsewhere as well like in the sorting algorithm where we're trying to respect the original offset as much as possible. This would free up Offset to be messed with (in what way isn't clear) before sorting occurs. So this is changeable but only by adding another field. Perhaps that is not a bad idea however. 
> 
> This said when I was thinking about this I never actually even thought to use Offset. I was thinking about respecting the memory image not the file image. I think as a guiding principal "respect the memory image" is a a better guiding principal but only slightly. The memory image and file image should agree for all valid files however so I'm not really opposed to changing it. I have a slight opposition but it's based on some ideas I have for changing the file layout algorithm and I don't want to justify code here based on code that hasn't been written yet. The general idea is that it is possible to enforce "respect the memory image" via const correctness/encapsulation but it is not possible to do the same for the file image because Offset will need to be changed.
> 
> My personal opinion is that adding a new Offset like field that isn't quite Offset and using that field to compute the new offset here isn't really any clearer that what I have here. What do you think?
I think I follow what you're saying and broadly agree with it. We have to respect the memory image, at least for linked objects, because if we change the relative positions of two sections within a segment, then the address mapping will no longer be correct. For relocatable objects, that's not an issue, because they don't have a memory mapping yet (and segments can be ignored in this case).

However, the file and memory image are not always consistent, even for valid programs. Specifically, SHT_NOBITS sections have no file image typically, but do have a memory image. Usually, this is not an issue, because the .bss section is placed at the end of the data segment. However, when using TLS (which is sometimes a nested segment in the data segment) this is not the always the case due to .tbss, because other non-TLS data usually appears after it. From my observations, different platforms exhibit different behaviour here, but at least some of them have no file image assigned for the .tbss section, but do have a memory image associated with it. As such, the layout algorithm needs to be able to handle this case.

That all being said, looking at the code without actually trying it out, I'm not convinced that this will handle the nested-segment case anyway, in which case what you've got here is fine. However, I would suggest updating the comment to explain why we need to respect interstitial memory gaps.


================
Comment at: tools/llvm-objcopy/Object.cpp:197-198
+  // aligned properlly however so we should align it as needed. This only takes
+  // a little bit of tweaking to ensure that the sh_name is 4 byte aligned.
+  Offset = alignTo(Offset, sizeof(typename ELFT::Word));
+  SHOffset = Offset;
----------------
jakehehrlich wrote:
> jhenderson wrote:
> > Why only the sh_name field? 4-byte alignment could cause other struct members to be misaligned (e.g. the sh_addr and sh_offset fields), because they are 8-bytes in size.
> The next field (sh_type) is also a 4-byte word and thus anything after those two will be 8-byte aligned. So this is the minimal alignment I can choose which ensures that all the fields of the section header are aligned. I updated the comment to reflect this fact.
Relative to the start of the section header, the fields will be aligned correctly. However, relative to the start of the file, they won't be: for example, in 64-bit ELF, with 8 byte address fields, the sh_name field was aligned to offset 4, the sh_type field will be at offset 8, and the 8-byte-sized sh_flags field will be at offset 12, which is misaligned.


Repository:
  rL LLVM

https://reviews.llvm.org/D33964





More information about the llvm-commits mailing list