[PATCH] D56806: [llvm-objcopy] Fix crash when writing empty binary output

Thu Jan 24 09:33:12 PST 2019

rupprecht marked 2 inline comments as done.
rupprecht added inline comments.

================
Comment at: llvm/tools/llvm-objcopy/Buffer.cpp:31
+  if (std::error_code EC = sys::Process::SafelyCloseFileDescriptor(FD))
+    reportError(FileName, EC);
+}
----------------
jhenderson wrote:
> rupprecht wrote:
> > rupprecht wrote:
> > > alexshap wrote:
> > > > rupprecht wrote:
> > > > > alexshap wrote:
> > > > > > so given we are trying to get rid of "reportError" (to enable us to build a library from this code in the future), maybe we can return an error from the method "allocate" instead ? (similarly to what "commit" does)
> > > > > SGTM, though I'd like to punt on changing the signature of allocate() in this patch. I'll start working on a separate patch now...
> > > > oh, probably my bad,
> > > > my second thought is the following: basically if i remember correctly we call Buffer::commit in the very end (and in the case of FileBuffer it does smth like renaming the temporary file backing the buffer (which was mmaped into the memory)  - i don't remember precisely, need to double check it).
> > > > If there is an error in the tool somewhere in the middle of the whole pipeline, the files are untouched and I think it's a nice property to have. But if we truncate a file inside the method "allocate" this won't be true anymore. So ideally all the "real" changes should be applied to the file system only inside the "commit" method + would be cool to have a test for this scenario. What do you think ?
> > > No worries, error handling here looks like something we should fix. I have something in progress (https://reviews.llvm.org/differential/diff/182462/) but I actually want to increase the scope of that refactoring, so it's not ready yet. I just want it to be orthogonal to this crash fix :)
> > > 
> > > I see what you mean about commit(). In the instance of the crash I'm fixing, it doesn't seem like it's at all possible for a crash in between finalize() and write(), but I can't say I've ruled out other situations where Size = 0.
> > > I don't know if it's testable, but I'll try to move this block to commit().
> > I couldn't come up with a way to add a test for this, but I manually verified it by throwing in an assert(false) and making sure the file isn't changed.
> > 
> > ```
> > $ echo abcd > /tmp/existing-file.txt
> > $ llvm-objcopy -R .text -O binary binary-output-empty.test.tmp.o /tmp/existing-file.txt
> > <crash>
> > $ cat /tmp/existing-file.txt
> > abcd
> > ```
> > (Before your suggestion, it was printing an empty file)
> > 
> > Given the state of the code, I don't think a crash is really possible -- unless something like malloc failed -- so I don't have a test to offer.
> This is probably fine as is, but a couple of ideas I had where this might fail (not 100% sure) is a) if another file has a lock on it, or b) if the file is read-only maybe? Are either of those testable in the current framework? I think you could test a) by using python to open it and then spawn llvm-objcopy without closing the file.
I think both of those would describe createAndTruncateFile() failures. If commit() fails, then that's "fine" in the sense that there are no side effects -- any file modification should fail. The original problem was:

  # allocate() is called, which truncates some existing data
  # some extra code is called, which crashes
  # commit() is never called, so the processed object is never written, but the original data is still destroyed

vs as implemented in the latest version of the patch:

  # allocate() is called, which remembers to defer truncating until later
  # some extra code is called, which crashes
  # commit() is never called, so truncation never happens and the original data is still there

But, I don't see any reasonable ways for the part in between to crash. The relevent snippet I'm trying to break is this:

```
  // From finalize()
  for (auto &Section : Obj.sections()) {
    if ((Section.Flags & SHF_ALLOC) == 0)
      continue;
    AllocatedSections.push_back(&Section);
  }
  TotalSize = 0;
  for (const auto &Section : AllocatedSections) {
    if (Section->Type != SHT_NOBITS)
      TotalSize = std::max(TotalSize, Section->Offset + Section->Size);
  }
  Buf.allocate(TotalSize);
  SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);

  // From write()
  for (auto &Section : Obj.sections()) {
    if ((Section.Flags & SHF_ALLOC) == 0)
      continue;
    Section.accept(*SecWriter);
  }
  return Buf.commit();
```

The things that could fail:
  - SecWriter construction fails, e.g. malloc failure
  - Section.accept() fails, but given the constraint that TotalSize is 0 (in order for allocate(0) to be called), the only sections it could be called on would be SHT_NOBITS or have zero offset/size. It looks like section visitors check SHT_NOBITS and return early, and I don't know if zero offset/size is valid or possible to construct.

================
Comment at: llvm/tools/llvm-objcopy/Buffer.cpp:57
+
+  assert(Buf && "allocate() not called before commit()!");
   Error Err = Buf->commit();
----------------
jhenderson wrote:
> Instead of this assert, how practical would it be to call allocate in the constructor?
allocate() is often called long after construction, e.g. llvm-objcopy.cpp creates the FileBuffer and ELF/Object.cpp calls allocate() after the object has been processed (and then it knows how much to allocate).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56806/new/

https://reviews.llvm.org/D56806