[PATCH] D112116: [llvm-objcopy] Add --update-section

Mon Nov 15 17:04:05 PST 2021

leonardchan added inline comments.

================
Comment at: llvm/test/tools/llvm-objcopy/ELF/update-section.test:95-98
+# NORMAL:   Name: .in_segment
+# NORMAL:   Size: 8
+# NORMAL:   |11112222|
+# NORMAL:   Name: .unmodified_in_segment
----------------
jhenderson wrote:
> Sorry for not bringing this up until now, but I'm a bit nervous about not using -NEXT in these checks. I know they're not in adjacent lines, so it's a little tricky, and it's fairly unlikely you'll get spurious passes by the checks matching following sections, but it still makes me twitchy. It may also degrade failure reports from FileCheck, as it might start matching at a later (incorrect) point.
> 
> I'd suggest perhaps the following pattern:
> ```
> # NORMAL:      Size:
> # NORMAL-SAME:      {{ }}8{{$}}
> ```
> 
> The addition of the `{{ }}` and `{{$}}` ensure that the digit isn't part of a wider number (otherwise it would match `6789`, for example, not just `8`.
> 
> Similar comments apply for the Offset and Size fields elsewhere below. For the Content, you probably want to do a similar approach too, to ensure you're matching that section's contents, and not some other section.
This is my first time learning of the `-SAME` directive. Very useful! Updated tests.

================
Comment at: llvm/test/tools/llvm-objcopy/ELF/update-section.test:14
+
+# Update segment section with a smaller chunk of data. This will only overwrite the start of the original data.
+# RUN: echo -n 11 > %t3.text
----------------
jhenderson wrote:
> leonardchan wrote:
> > jhenderson wrote:
> > > jhenderson wrote:
> > > > I think this does mean the section size as recorded in the section header is reduced though, right? Probably say "This will only overwrite the start of the original data and adjust the section header."
> > > > 
> > > > Should we also have a test case without a segment?
> > > It's a bit fiddly, but I think you should test the bytes that appear immediately after this section to show that the contents are unchanged. It's not necessarily an issue if they are/aren't changed, but I think we need to make a conscious decision should this behaviour ever be changed, hence the test.
> > > 
> > > Also, some more testing that occurred to me: we should have a section immediately following the to-be-shrunk section, and show that its offset hasn't changed (or more specifically, its offset relative to the segment start), to show that the shrinking doesn't cause the other section to bunch up.
> > > It's a bit fiddly, but I think you should test the bytes that appear immediately after this section to show that the contents are unchanged. It's not necessarily an issue if they are/aren't changed, but I think we need to make a conscious decision should this behaviour ever be changed, hence the test.
> > 
> > I think I may want to push back on this part since this feels like it would be testing against undefined behavior (by checking for a value that we don't explicitly set/guarantee). I think the user just wants to ensure that the smaller chunk is written (and the size is updated), but not necessarily what the remaining data is. It just so happens that under the hood, we do sequential memcpy's that result on the old data still being there, but end-to-end-behavior-wise, I think we don't want to explicitly test the implementation.
> > 
> > I see that the test comment actually points this out, which is my bad. So I'll update this.
> > 
> > (I'll also admit that I can't find an easy way to test this other than doing a hexdump and inspecting specific bits, which seems a bit complex :)
> > 
> > > Also, some more testing that occurred to me: we should have a section immediately following the to-be-shrunk section, and show that its offset hasn't changed (or more specifically, its offset relative to the segment start), to show that the shrinking doesn't cause the other section to bunch up.
> > 
> > Added a test for this.
> Elsewhere, llvm-objcopy's rules for what gets written in the gaps within segments which are not covered by sections are well defined. Specifically, the existing contents are preserved, unless a section has been explicitly removed, in which case zeroes are written. The logic for the former is that there may be important data stored outside a section, such as interrupt instructions to prevent overrunning the end of a function or similar. The latter makes sense because you've deliberately removed a section, so you shouldn't leave its data lying around (maybe it was removed for reasons of secrecy etc). Arguably, the same case could be made for `--update-section`, but I could see it either way.
> 
> I've used `od` in the past to dump hex bytes of segments. See, for example, preserve-segment-contents.test. It's not actually that hard, as long as you know the offset (NB: I'm going off memory of that test, which might be incorrect, but there should be other examples if I'm wrong).
> 
> As an aside, if and when --gap-fill is ever implemented, it should definitely overwrite the data left behind when shrinking the section. Also, if --pad-to is implemented, it may be possible to use similar logic to allow --update-section to grow the section (otherwise we're in a weird situation where --pad-to can be used to grow the section, but not --update-section!). Not suggesting this as part of this patch, but I'm going to raise these points on D67689 too.
Ah, thank you for the clarifications and tests I can reference. Added appropriate test cases.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112116/new/

https://reviews.llvm.org/D112116