[PATCH] D13989: This removes the eating of the error in Archive::Child::getSize() when the charactersin the size field in the archive header for the member is not a number.

Rafael Espíndola via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 2 05:38:26 PST 2015


I have changed code that was using archive_iterator in cases where
end() (or error) was not possible to use Child.

This reduced the impact of this change. An rebased patch is attached.

In addition to the previous issue, I noticed that we use 'auto' in
case the style guide recommends against:

  if (auto Err = ChildOrErr.getError())

should be just std::error_code.


-    outs() << ' ' << format("%6llu", C.getSize());
+    ErrorOr<uint64_t> Size = C.getSize();
+    if (Size.getError())
+      outs() << ' ' << "bad size";

It would be much better for llvm-ar (and any tool IMHO) to fail when
finding a corrupted file. In the case of llvm-ar, just call fail.


Cheers,
Rafael


On 31 October 2015 at 19:08, Rafael Espíndola
<rafael.espindola at gmail.com> wrote:
> The main problem with the patch was making Archive.cpp more
> complicated. I have fixed that by not storing child_iterators in the
> Archive.
>
> Now I think the main remaining issues are cases like
>
> -    if (ChildIt != A->child_end()) {
> +    if (*ChildIt && ChildIt != A->child_end()) {
>
> This silently ignores the error. To keep this patch small, I would
> suggest that for now this be changed to
>
> If (std::error_code EC = ChildIt->getError())
>   report_fatal_error(EC.message());
> if (ChildIt != A->child_end()) {
> ...
>
> A followup patch can propagate the error.
>
> I have attached a patch rebased on trunk that stores an ErrorOr<Child>
> in the iterator.
>
> Cheers,
> Rafael
>
>
>
>
> On 31 October 2015 at 10:14, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>> OK, I am starting to lean a bit more on the idea of the iterator
>> having an ErrorOr<Child>. I will take a second look at that option.
>>
>> Sorry for the detour.
>>
>> Cheers,
>> Rafael
>>
>>
>> On 29 October 2015 at 22:53, Lang Hames via llvm-commits
>> <llvm-commits at lists.llvm.org> wrote:
>>> That's a good point: next() has to return an ErrorOr<iterator>, and you have
>>> to distinguish between failure and end(), so if you want to check each call
>>> to next() you need to do something like:
>>>
>>> auto ErrOrItr = File->error_or_child_begin();
>>> if (!ErrOrItr)
>>>   return ErrOrItr.getError();
>>> auto Itr = *ErrOrItr;
>>> while (Itr != file->child_end()) {
>>>   auto &Thing = *Itr;
>>>
>>>   ErrOrItr = increment(Itr);
>>>   if (!ErrOrItr)
>>>     return ErrOrItr.getError();
>>> }
>>>
>>> If you write an iterator wrapper that includes a failure state you can get
>>> that down to:
>>>
>>> auto ErrOrItr = File->child_begin();
>>> for (; ErrOrItr && ErrOrItr != File->child_end(); ErrOrItr.next()) {
>>>   Child = *ErrOrItr;
>>>   // ...
>>> }
>>> if (!ErrOrItr)
>>>   return ErrOrItr.getError();
>>>
>>> But now you have to be careful to check the error outside the loop.
>>>
>>> Looking at Kevin's updated diff, the change to the iterators made those
>>> loops a lot uglier. The introduction of the Increment function fixes some of
>>> that, but at the cost of risking the problem that you were trying to avoid:
>>> It's suddenly very easy to "increment" the iterator without actually
>>> checking the error.
>>>
>>> - Lang.
>>>
>>>
>>> On Thu, Oct 29, 2015 at 8:30 PM, Kevin Enderby <enderby at apple.com> wrote:
>>>>
>>>> Hi Lang and others,
>>>>
>>>> This iterator basically requires another indirection to get to the
>>>> underlying Child.  So you get a compile error when trying to use the
>>>> iterator as it is an <ErrorOr>.  So the example really is to change
>>>> something like this (as the diff in ELF/InputFiles.cpp):
>>>>
>>>> for (const Archive::Child &Child : File->children()) {
>>>>>>>>
>>>> to this:
>>>>
>>>>  for (auto &ChildOrErr : File->children()) {
>>>>     error(ChildOrErr, "Could not get the child of the archive " +
>>>>                           File->getFileName());
>>>>     const Archive::Child Child(*ChildOrErr);
>>>>   ...
>>>>
>>>> Which seems a bit cleaner than testing the result of child_begin() before
>>>> the for loop and checking getNext() even if it can be wrapped in some case
>>>> in and Increment() routine.
>>>>
>>>> My thoughts,
>>>> Kev
>>>>
>>>> On Oct 29, 2015, at 7:50 PM, Lang Hames via llvm-commits
>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>
>>>> Yep. The standard C++ iterator concept doesn't apply here, since it
>>>> doesn't deal with failure, and checking on deref rather than increment means
>>>> you can accidentally do multiple increments without checking. My intent was
>>>> to have increment crash if you run it on an error'd out iterator, but that's
>>>> not as good as a compile-time check. I think it's a matter of personal taste
>>>> whether the cost (IMHO fairly minimal, since in practice you almost always
>>>> deref after increment) is worth paying for the gain (also arguably minimal)
>>>> of a neater iteration idiom.
>>>>
>>>> I don't have particularly strong feelings either way.
>>>>
>>>> - Lang.
>>>>
>>>> On Thu, Oct 29, 2015 at 5:51 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Oct 29, 2015 at 5:27 PM, Lang Hames via llvm-commits
>>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>>
>>>>>> Hi Rafael,
>>>>>>
>>>>>> How could you ignore the error in Kevin's solution?
>>>>>>
>>>>>> I think your solution and his are essentially the same, except that you
>>>>>> check the error on increment and he checks it on dereference. In practice
>>>>>> you always have to do both operations (you can't do multiple increments
>>>>>> without checking the error anyway, and you can just do that by deref'ing),
>>>>>> and checking via deref allows you to used range-based for, which makes the
>>>>>> code neater.
>>>>>
>>>>>
>>>>> I'm not taking a position either way - but that does present a difficult
>>>>> problem for the iterator contract, to say that you can't increment it twice
>>>>> without dereferencing... that's not really in the iterator playbook/contract
>>>>> (could implement a little extra support for that by allowing an error-state
>>>>> iterator to be incremented to become the end iterator). I know you've
>>>>> mentioned it before (the "can't increment twice without checking") but it
>>>>> became a bit clearer to me in this recent email exchange to the point that I
>>>>> would be a little less comfortable with it.
>>>>>
>>>>> I can't find the example you guys had:
>>>>>
>>>>> for (auto &Thing : things) {
>>>>>   if (!Thing)
>>>>>     return Thing.getError();
>>>>>   ...
>>>>> }
>>>>>
>>>>> & yeah, nothing that comes to mind is as tidy/simple as that, my nearest
>>>>> might be:
>>>>>
>>>>> auto Iter = things.begin();
>>>>> while (auto &Thing = Iter.Next()) {
>>>>>   ....
>>>>> }
>>>>> if (!Iter)
>>>>>   return Iter.getError();
>>>>>
>>>>> (where "Iter" isn't a traditional iterator in any sense - it has
>>>>> ErrorOr<T> Next() and getError (& possibly boolean testability))
>>>>>
>>>>> The obvious sort of problem I imagine arising from the iterator/range API
>>>>> is something like:
>>>>>
>>>>>   int count = std::distance(things.begin(), things.end());
>>>>>
>>>>> Easy to write, and would produce an infinite loop (or crash?) if there
>>>>> was an error during iteration.
>>>>>
>>>>> - Dave
>>>>>
>>>>> - Dave
>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Lang.
>>>>>>
>>>>>> On Thu, Oct 29, 2015 at 5:11 PM, Rafael Espíndola
>>>>>> <rafael.espindola at gmail.com> wrote:
>>>>>>>
>>>>>>> On 28 October 2015 at 15:21, Kevin Enderby <enderby at apple.com> wrote:
>>>>>>> > enderby added a comment.
>>>>>>> >
>>>>>>> > I agree with Lang and like the original patch where were were
>>>>>>> > handling failure was done inside the iterator.  It does allow use of
>>>>>>> > ranged-based for loops and appears to me much cleaner in the code especially
>>>>>>> > in the lld changes needed in http://reviews.llvm.org/D13990 .
>>>>>>>
>>>>>>> I think that the most important consideration is making the error hard
>>>>>>> to ignore. That means using ErrorOr.
>>>>>>>
>>>>>>> The sad reality is that the archive format can cause us to find an
>>>>>>> error when going from one member to the next. That, with the above
>>>>>>> requirement means that it doesn't fit on range loop.
>>>>>>>
>>>>>>> > Rafael, I did "finish it up" based on your t.diff patch and that is
>>>>>>> > below and includes the re-formatting with clang-format-diff, removal of the
>>>>>>> > malformed-archives directory putting the .a files in Inputs and removal of
>>>>>>> > the "nice but independent" changes.
>>>>>>> >
>>>>>>> > F1026846: llvm_updated_t.diff <http://reviews.llvm.org/F1026846>
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> You don't need to change ErrorOr.h.
>>>>>>>
>>>>>>> The change to detect the end of the archive seems more complicated
>>>>>>> than necessary (see the attached patch for a suggestion).
>>>>>>>
>>>>>>> Cases like the Increment in BinaryHolder expose a naked EC, which is
>>>>>>> what using ErrorOr is trying to avoid. Just inlining it solves the
>>>>>>> problem (see patch).
>>>>>>>
>>>>>>> Please convert the remaining user of Increment to return void when
>>>>>>> possible.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rafael
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
-------------- next part --------------
diff --git a/include/llvm/Object/Archive.h b/include/llvm/Object/Archive.h
index a2797ad..9cde3a7 100644
--- a/include/llvm/Object/Archive.h
+++ b/include/llvm/Object/Archive.h
@@ -66,7 +66,7 @@ public:
     bool isThinMember() const;
 
   public:
-    Child(const Archive *Parent, const char *Start);
+    Child(const Archive *Parent, const char *Start, std::error_code *EC);
     Child(const Archive *Parent, StringRef Data, uint16_t StartOfFile);
 
     bool operator ==(const Child &other) const {
@@ -75,7 +75,7 @@ public:
     }
 
     const Archive *getParent() const { return Parent; }
-    Child getNext() const;
+    ErrorOr<Child> getNext() const;
 
     ErrorOr<StringRef> getName() const;
     StringRef getRawName() const { return getHeader()->getName(); }
@@ -91,9 +91,9 @@ public:
       return getHeader()->getAccessMode();
     }
     /// \return the size of the archive member without the header or padding.
-    uint64_t getSize() const;
+    ErrorOr<uint64_t> getSize() const;
     /// \return the size in the archive header for this member.
-    uint64_t getRawSize() const;
+    ErrorOr<uint64_t> getRawSize() const;
 
     ErrorOr<StringRef> getBuffer() const;
     uint64_t getChildOffset() const;
@@ -105,24 +105,32 @@ public:
   };
 
   class child_iterator {
-    Child child;
+    ErrorOr<Child> child;
 
   public:
-    child_iterator() : child(Child(nullptr, nullptr)) {}
+    child_iterator() : child(Child(nullptr, nullptr, nullptr)) {}
     child_iterator(const Child &c) : child(c) {}
-    const Child *operator->() const { return &child; }
-    const Child &operator*() const { return child; }
+    child_iterator(std::error_code EC) : child(EC) {}
+    const ErrorOr<Child> *operator->() const { return &child; }
+    const ErrorOr<Child> &operator*() const { return child; }
 
     bool operator==(const child_iterator &other) const {
-      return child == other.child;
+      // We ignore error states so that comparisions with end() work, which
+      // allows range loops.
+      if (child.getError() || other.child.getError())
+        return false;
+      return *child == *other.child;
     }
 
     bool operator!=(const child_iterator &other) const {
       return !(*this == other);
     }
 
+    // Code in loops with child_iterators must check for errors on each loop
+    // iteration.  And if there is an error break out of the loop.
     child_iterator &operator++() { // Preincrement
-      child = child.getNext();
+      assert(child && "Can't increment iterator with error");
+      child = child->getNext();
       return *this;
     }
   };
diff --git a/include/llvm/Support/ErrorOr.h b/include/llvm/Support/ErrorOr.h
index 46b6192..ca6ede7 100644
--- a/include/llvm/Support/ErrorOr.h
+++ b/include/llvm/Support/ErrorOr.h
@@ -91,6 +91,7 @@ private:
   typedef typename std::remove_reference<T>::type &reference;
   typedef const typename std::remove_reference<T>::type &const_reference;
   typedef typename std::remove_reference<T>::type *pointer;
+  typedef const typename std::remove_reference<T>::type *const_pointer;
 
 public:
   template <class E>
@@ -183,10 +184,14 @@ public:
     return toPointer(getStorage());
   }
 
+  const_pointer operator->() const { return toPointer(getStorage()); }
+
   reference operator *() {
     return *getStorage();
   }
 
+  const_reference operator*() const { return *getStorage(); }
+
 private:
   template <class OtherT>
   void copyConstruct(const ErrorOr<OtherT> &Other) {
@@ -246,10 +251,14 @@ private:
     return Val;
   }
 
+  const_pointer toPointer(const_pointer Val) const { return Val; }
+
   pointer toPointer(wrap *Val) {
     return &Val->get();
   }
 
+  const_pointer toPointer(const wrap *Val) const { return &Val->get(); }
+
   storage_type *getStorage() {
     assert(!HasError && "Cannot get value when an error exists!");
     return reinterpret_cast<storage_type*>(TStorage.buffer);
diff --git a/lib/ExecutionEngine/MCJIT/MCJIT.cpp b/lib/ExecutionEngine/MCJIT/MCJIT.cpp
index bbe4432..357409d 100644
--- a/lib/ExecutionEngine/MCJIT/MCJIT.cpp
+++ b/lib/ExecutionEngine/MCJIT/MCJIT.cpp
@@ -318,10 +318,10 @@ RuntimeDyld::SymbolInfo MCJIT::findSymbol(const std::string &Name,
     object::Archive *A = OB.getBinary();
     // Look for our symbols in each Archive
     object::Archive::child_iterator ChildIt = A->findSym(Name);
-    if (ChildIt != A->child_end()) {
+    if (*ChildIt && ChildIt != A->child_end()) {
       // FIXME: Support nested archives?
       ErrorOr<std::unique_ptr<object::Binary>> ChildBinOrErr =
-          ChildIt->getAsBinary();
+          (*ChildIt)->getAsBinary();
       if (ChildBinOrErr.getError())
         continue;
       std::unique_ptr<object::Binary> &ChildBin = ChildBinOrErr.get();
diff --git a/lib/ExecutionEngine/Orc/OrcMCJITReplacement.h b/lib/ExecutionEngine/Orc/OrcMCJITReplacement.h
index d1aff5a..e0aacd8 100644
--- a/lib/ExecutionEngine/Orc/OrcMCJITReplacement.h
+++ b/lib/ExecutionEngine/Orc/OrcMCJITReplacement.h
@@ -252,10 +252,10 @@ private:
       object::Archive *A = OB.getBinary();
       // Look for our symbols in each Archive
       object::Archive::child_iterator ChildIt = A->findSym(Name);
-      if (ChildIt != A->child_end()) {
+      if (*ChildIt && ChildIt != A->child_end()) {
         // FIXME: Support nested archives?
         ErrorOr<std::unique_ptr<object::Binary>> ChildBinOrErr =
-            ChildIt->getAsBinary();
+            (*ChildIt)->getAsBinary();
         if (ChildBinOrErr.getError())
           continue;
         std::unique_ptr<object::Binary> &ChildBin = ChildBinOrErr.get();
diff --git a/lib/Object/Archive.cpp b/lib/Object/Archive.cpp
index d50a699..88b3851 100644
--- a/lib/Object/Archive.cpp
+++ b/lib/Object/Archive.cpp
@@ -46,7 +46,7 @@ StringRef ArchiveMemberHeader::getName() const {
 ErrorOr<uint32_t> ArchiveMemberHeader::getSize() const {
   uint32_t Ret;
   if (llvm::StringRef(Size, sizeof(Size)).rtrim(" ").getAsInteger(10, Ret))
-    return object_error::parse_failed;
+    return object_error::parse_failed; // Size is not a decimal number.
   return Ret;
 }
 
@@ -86,7 +86,8 @@ Archive::Child::Child(const Archive *Parent, StringRef Data,
                       uint16_t StartOfFile)
     : Parent(Parent), Data(Data), StartOfFile(StartOfFile) {}
 
-Archive::Child::Child(const Archive *Parent, const char *Start)
+Archive::Child::Child(const Archive *Parent, const char *Start,
+                      std::error_code *EC)
     : Parent(Parent) {
   if (!Start)
     return;
@@ -94,7 +95,10 @@ Archive::Child::Child(const Archive *Parent, const char *Start)
   uint64_t Size = sizeof(ArchiveMemberHeader);
   Data = StringRef(Start, Size);
   if (!isThinMember()) {
-    Size += getRawSize();
+    ErrorOr<uint64_t> MemberSize = getRawSize();
+    if ((*EC = MemberSize.getError()))
+      return;
+    Size += MemberSize.get();
     Data = StringRef(Start, Size);
   }
 
@@ -110,20 +114,20 @@ Archive::Child::Child(const Archive *Parent, const char *Start)
   }
 }
 
-uint64_t Archive::Child::getSize() const {
+ErrorOr<uint64_t> Archive::Child::getSize() const {
   if (Parent->IsThin) {
     ErrorOr<uint32_t> Size = getHeader()->getSize();
-    if (Size.getError())
-      return 0;
+    if (std::error_code EC = Size.getError())
+      return EC;
     return Size.get();
   }
   return Data.size() - StartOfFile;
 }
 
-uint64_t Archive::Child::getRawSize() const {
+ErrorOr<uint64_t> Archive::Child::getRawSize() const {
   ErrorOr<uint32_t> Size = getHeader()->getSize();
-  if (Size.getError())
-    return 0;
+  if (std::error_code EC = Size.getError())
+    return EC;
   return Size.get();
 }
 
@@ -133,8 +137,12 @@ bool Archive::Child::isThinMember() const {
 }
 
 ErrorOr<StringRef> Archive::Child::getBuffer() const {
-  if (!isThinMember())
-    return StringRef(Data.data() + StartOfFile, getSize());
+  if (!isThinMember()) {
+    ErrorOr<uint32_t> Size = getSize();
+    if (std::error_code EC = Size.getError())
+      return EC;
+    return StringRef(Data.data() + StartOfFile, Size.get());
+  }
   ErrorOr<StringRef> Name = getName();
   if (std::error_code EC = Name.getError())
     return EC;
@@ -148,7 +156,7 @@ ErrorOr<StringRef> Archive::Child::getBuffer() const {
   return Parent->ThinBuffers.back()->getBuffer();
 }
 
-Archive::Child Archive::Child::getNext() const {
+ErrorOr<Archive::Child> Archive::Child::getNext() const {
   size_t SpaceToSkip = Data.size();
   // If it's odd, add 1 to make it even.
   if (SpaceToSkip & 1)
@@ -156,11 +164,19 @@ Archive::Child Archive::Child::getNext() const {
 
   const char *NextLoc = Data.data() + SpaceToSkip;
 
+  // Check to see if this is at the end of the archive.
+  if (NextLoc == Parent->Data.getBufferEnd())
+    return Child(Parent, nullptr, nullptr);
+
   // Check to see if this is past the end of the archive.
-  if (NextLoc >= Parent->Data.getBufferEnd())
-    return Child(Parent, nullptr);
+  if (NextLoc > Parent->Data.getBufferEnd())
+    return object_error::parse_failed;
 
-  return Child(Parent, NextLoc);
+  std::error_code EC;
+  Child Ret(Parent, NextLoc, &EC);
+  if (EC)
+    return EC;
+  return Ret;
 }
 
 uint64_t Archive::Child::getChildOffset() const {
@@ -255,15 +271,26 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
   }
 
   // Get the special members.
-  child_iterator i = child_begin(false);
-  child_iterator e = child_end();
+  child_iterator I = child_begin(false);
+  if ((ec = I->getError()))
+    return;
+  child_iterator E = child_end();
 
-  if (i == e) {
+  if (I == E) {
     ec = std::error_code();
     return;
   }
+  const Child *C = &**I;
 
-  StringRef Name = i->getRawName();
+  auto Increment = [&]() {
+    ++I;
+    if ((ec = I->getError()))
+      return true;
+    C = &**I;
+    return false;
+  };
+
+  StringRef Name = C->getRawName();
 
   // Below is the pattern that is used to figure out the archive format
   // GNU archive format
@@ -288,9 +315,11 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
     Format = K_BSD;
     // We know that the symbol table is not an external file, so we just assert
     // there is no error.
-    SymbolTable = *i->getBuffer();
-    ++i;
-    setFirstRegular(*i);
+    SymbolTable = *C->getBuffer();
+    if (Increment())
+      return;
+    setFirstRegular(*C);
+
     ec = std::error_code();
     return;
   }
@@ -298,7 +327,7 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
   if (Name.startswith("#1/")) {
     Format = K_BSD;
     // We know this is BSD, so getName will work since there is no string table.
-    ErrorOr<StringRef> NameOrErr = i->getName();
+    ErrorOr<StringRef> NameOrErr = C->getName();
     ec = NameOrErr.getError();
     if (ec)
       return;
@@ -306,10 +335,11 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
     if (Name == "__.SYMDEF SORTED" || Name == "__.SYMDEF") {
       // We know that the symbol table is not an external file, so we just
       // assert there is no error.
-      SymbolTable = *i->getBuffer();
-      ++i;
+      SymbolTable = *C->getBuffer();
+      if (Increment())
+        return;
     }
-    setFirstRegular(*i);
+    setFirstRegular(*C);
     return;
   }
 
@@ -322,32 +352,34 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
   if (Name == "/" || Name == "/SYM64/") {
     // We know that the symbol table is not an external file, so we just assert
     // there is no error.
-    SymbolTable = *i->getBuffer();
+    SymbolTable = *C->getBuffer();
     if (Name == "/SYM64/")
       has64SymTable = true;
 
-    ++i;
-    if (i == e) {
+    if (Increment())
+      return;
+    if (I == E) {
       ec = std::error_code();
       return;
     }
-    Name = i->getRawName();
+    Name = C->getRawName();
   }
 
   if (Name == "//") {
     Format = has64SymTable ? K_MIPS64 : K_GNU;
     // The string table is never an external member, so we just assert on the
     // ErrorOr.
-    StringTable = *i->getBuffer();
-    ++i;
-    setFirstRegular(*i);
+    StringTable = *C->getBuffer();
+    if (Increment())
+      return;
+    setFirstRegular(*C);
     ec = std::error_code();
     return;
   }
 
   if (Name[0] != '/') {
     Format = has64SymTable ? K_MIPS64 : K_GNU;
-    setFirstRegular(*i);
+    setFirstRegular(*C);
     ec = std::error_code();
     return;
   }
@@ -360,25 +392,28 @@ Archive::Archive(MemoryBufferRef Source, std::error_code &ec)
   Format = K_COFF;
   // We know that the symbol table is not an external file, so we just assert
   // there is no error.
-  SymbolTable = *i->getBuffer();
+  SymbolTable = *C->getBuffer();
+
+  if (Increment())
+    return;
 
-  ++i;
-  if (i == e) {
-    setFirstRegular(*i);
+  if (I == E) {
+    setFirstRegular(*C);
     ec = std::error_code();
     return;
   }
 
-  Name = i->getRawName();
+  Name = C->getRawName();
 
   if (Name == "//") {
     // The string table is never an external member, so we just assert on the
     // ErrorOr.
-    StringTable = *i->getBuffer();
-    ++i;
+    StringTable = *C->getBuffer();
+    if (Increment())
+      return;
   }
 
-  setFirstRegular(*i);
+  setFirstRegular(*C);
   ec = std::error_code();
 }
 
@@ -390,12 +425,15 @@ Archive::child_iterator Archive::child_begin(bool SkipInternal) const {
     return Child(this, FirstRegularData, FirstRegularStartOfFile);
 
   const char *Loc = Data.getBufferStart() + strlen(Magic);
-  Child c(this, Loc);
-  return c;
+  std::error_code EC;
+  Child c(this, Loc, &EC);
+  if (EC)
+    return child_iterator(EC);
+  return child_iterator(c);
 }
 
 Archive::child_iterator Archive::child_end() const {
-  return Child(this, nullptr);
+  return Child(this, nullptr, nullptr);
 }
 
 StringRef Archive::Symbol::getName() const {
@@ -447,7 +485,11 @@ ErrorOr<Archive::child_iterator> Archive::Symbol::getMember() const {
   }
 
   const char *Loc = Parent->getData().begin() + Offset;
-  child_iterator Iter(Child(Parent, Loc));
+  std::error_code EC;
+  Child c(Parent, Loc, &EC);
+  if (EC)
+    return EC;
+  child_iterator Iter(c);
   return Iter;
 }
 
diff --git a/lib/Object/ArchiveWriter.cpp b/lib/Object/ArchiveWriter.cpp
index b68f7ef..f207dfb 100644
--- a/lib/Object/ArchiveWriter.cpp
+++ b/lib/Object/ArchiveWriter.cpp
@@ -39,7 +39,7 @@ NewArchiveIterator::NewArchiveIterator(const object::Archive::Child &OldMember,
     : IsNewMember(false), Name(Name), OldMember(OldMember) {}
 
 NewArchiveIterator::NewArchiveIterator(StringRef FileName)
-    : IsNewMember(true), Name(FileName), OldMember(nullptr, nullptr) {}
+    : IsNewMember(true), Name(FileName), OldMember(nullptr, nullptr, nullptr) {}
 
 StringRef NewArchiveIterator::getName() const { return Name; }
 
@@ -412,8 +412,11 @@ llvm::writeArchive(StringRef ArcName,
                         Status.getSize());
     } else {
       const object::Archive::Child &OldMember = I.getOld();
+      ErrorOr<uint32_t> Size = OldMember.getSize();
+      if (std::error_code EC = Size.getError())
+        return std::make_pair("", EC);
       printMemberHeader(Out, Kind, Thin, I.getName(), StringMapIndexIter,
-                        ModTime, UID, GID, Perms, OldMember.getSize());
+                        ModTime, UID, GID, Perms, Size.get());
     }
 
     if (!Thin)
diff --git a/test/tools/llvm-objdump/Inputs/libbogus1.a b/test/tools/llvm-objdump/Inputs/libbogus1.a
new file mode 100644
index 0000000..510c145
--- /dev/null
+++ b/test/tools/llvm-objdump/Inputs/libbogus1.a
@@ -0,0 +1,13 @@
+!<arch>
+hello.c         1444941273  124   0     100644  10%       `
+#include <stdio.h>
+#include <stdlib.h>
+int
+main()
+{
+	printf("Hello World\n");
+	return EXIT_SUCCESS;
+}
+foo.c           1444941645  124   0     100644  1%        `
+void foo(void){}
+
diff --git a/test/tools/llvm-objdump/Inputs/libbogus2.a b/test/tools/llvm-objdump/Inputs/libbogus2.a
new file mode 100644
index 0000000..2ccb7f3
--- /dev/null
+++ b/test/tools/llvm-objdump/Inputs/libbogus2.a
@@ -0,0 +1,13 @@
+!<arch>
+hello.c         1444941273  124   0     100644  102       `
+#include <stdio.h>
+#include <stdlib.h>
+int
+main()
+{
+	printf("Hello World\n");
+	return EXIT_SUCCESS;
+}
+foo.c           1444941645  124   0     100644  1%        `
+void foo(void){}
+
diff --git a/test/tools/llvm-objdump/Inputs/libbogus3.a b/test/tools/llvm-objdump/Inputs/libbogus3.a
new file mode 100644
index 0000000..f15a732
--- /dev/null
+++ b/test/tools/llvm-objdump/Inputs/libbogus3.a
@@ -0,0 +1,16 @@
+!<arch>
+hello.c         1444941273  124   0     100644  102       `
+#include <stdio.h>
+#include <stdlib.h>
+int
+main()
+{
+	printf("Hello World\n");
+	return EXIT_SUCCESS;
+}
+foo.c           1444941645  124   0     100644  171       `
+void foo(void){}
+
+bar.c           1445026190  124   0     100644  17        `
+void foo(void){}
+
diff --git a/test/tools/llvm-objdump/malformed-archives.test b/test/tools/llvm-objdump/malformed-archives.test
new file mode 100644
index 0000000..b8ba312
--- /dev/null
+++ b/test/tools/llvm-objdump/malformed-archives.test
@@ -0,0 +1,20 @@
+// These test checks that llvm-objdump will not crash with malformed Archive
+// files.  So the check line is not all that important but the bug fixes to
+// make sure llvm-objdump is robust is what matters.
+# RUN: llvm-objdump -macho -archive-headers \
+# RUN:   %p/Inputs/libbogus1.a \
+# RUN:   2>&1 | FileCheck -check-prefix=bogus1 %s 
+
+# bogus1: Invalid data was encountered while parsing the file
+
+# RUN: llvm-objdump -macho -archive-headers \
+# RUN:   %p/Inputs/libbogus2.a \
+# RUN:   2>&1 | FileCheck -check-prefix=bogus2 %s 
+
+# bogus2: hello.c
+
+# RUN: llvm-objdump -macho -archive-headers \
+# RUN:   %p/Inputs/libbogus3.a \
+# RUN:   2>&1 | FileCheck -check-prefix=bogus3 %s 
+
+# bogus3: foo.c
diff --git a/tools/dsymutil/BinaryHolder.cpp b/tools/dsymutil/BinaryHolder.cpp
index 7ff4fd6..01a49c9 100644
--- a/tools/dsymutil/BinaryHolder.cpp
+++ b/tools/dsymutil/BinaryHolder.cpp
@@ -109,7 +109,10 @@ BinaryHolder::GetArchiveMemberBuffers(StringRef Filename,
   Buffers.reserve(CurrentArchives.size());
 
   for (const auto &CurrentArchive : CurrentArchives) {
-    for (const auto &Child : CurrentArchive->children()) {
+    for (auto ChildOrErr : CurrentArchive->children()) {
+      if (auto Err = ChildOrErr.getError())
+        return Err;
+      const auto &Child = *ChildOrErr;
       if (auto NameOrErr = Child.getName()) {
         if (*NameOrErr == Filename) {
           if (Timestamp != sys::TimeValue::PosixZeroTime() &&
diff --git a/tools/llvm-ar/llvm-ar.cpp b/tools/llvm-ar/llvm-ar.cpp
index f0e7284..d83d61b 100644
--- a/tools/llvm-ar/llvm-ar.cpp
+++ b/tools/llvm-ar/llvm-ar.cpp
@@ -338,7 +338,11 @@ static void doDisplayTable(StringRef Name, const object::Archive::Child &C) {
     printMode(Mode & 007);
     outs() << ' ' << C.getUID();
     outs() << '/' << C.getGID();
-    outs() << ' ' << format("%6llu", C.getSize());
+    ErrorOr<uint64_t> Size = C.getSize();
+    if (Size.getError())
+      outs() << ' ' << "bad size";
+    else
+      outs() << ' ' << format("%6llu", Size.get());
     outs() << ' ' << C.getLastModified().str();
     outs() << ' ';
   }
@@ -403,7 +407,14 @@ static void performReadOperation(ArchiveOperation Operation,
   }
 
   bool Filter = !Members.empty();
-  for (const object::Archive::Child &C : OldArchive->children()) {
+  for (auto &ChildOrErr : OldArchive->children()) {
+    if (ChildOrErr.getError()) {
+      errs() << ToolName << ": error reading '" << ArchiveName
+             << "': " << ChildOrErr.getError().message() << "!\n";
+      return;
+    }
+    const object::Archive::Child &C = *ChildOrErr;
+
     ErrorOr<StringRef> NameOrErr = C.getName();
     failIfError(NameOrErr.getError());
     StringRef Name = NameOrErr.get();
@@ -523,7 +534,9 @@ computeNewArchiveMembers(ArchiveOperation Operation,
   int InsertPos = -1;
   StringRef PosName = sys::path::filename(RelPos);
   if (OldArchive) {
-    for (auto &Child : OldArchive->children()) {
+    for (auto &ChildOrErr : OldArchive->children()) {
+      failIfError(ChildOrErr.getError());
+      auto &Child = ChildOrErr.get();
       int Pos = Ret.size();
       ErrorOr<StringRef> NameOrErr = Child.getName();
       failIfError(NameOrErr.getError());
@@ -726,7 +739,9 @@ static void runMRIScript() {
       failIfError(LibOrErr.getError(), "Could not parse library");
       Archives.push_back(std::move(*LibOrErr));
       object::Archive &Lib = *Archives.back();
-      for (auto &Member : Lib.children()) {
+      for (auto &MemberOrErr : Lib.children()) {
+        failIfError(MemberOrErr.getError());
+        auto &Member = MemberOrErr.get();
         ErrorOr<StringRef> NameOrErr = Member.getName();
         failIfError(NameOrErr.getError());
         addMember(NewMembers, Member, *NameOrErr);
diff --git a/tools/llvm-cxxdump/llvm-cxxdump.cpp b/tools/llvm-cxxdump/llvm-cxxdump.cpp
index d45a28a..96526c4 100644
--- a/tools/llvm-cxxdump/llvm-cxxdump.cpp
+++ b/tools/llvm-cxxdump/llvm-cxxdump.cpp
@@ -482,7 +482,12 @@ static void dumpCXXData(const ObjectFile *Obj) {
 }
 
 static void dumpArchive(const Archive *Arc) {
-  for (const Archive::Child &ArcC : Arc->children()) {
+  for (auto &ErrorOrChild : Arc->children()) {
+    if (std::error_code EC = ErrorOrChild.getError()) {
+      reportError(Arc->getFileName(), EC.message());
+      break;
+    }
+    const Archive::Child &ArcC = *ErrorOrChild;
     ErrorOr<std::unique_ptr<Binary>> ChildOrErr = ArcC.getAsBinary();
     if (std::error_code EC = ChildOrErr.getError()) {
       // Ignore non-object files.
diff --git a/tools/llvm-nm/llvm-nm.cpp b/tools/llvm-nm/llvm-nm.cpp
index 9d0cc57..5fb3363 100644
--- a/tools/llvm-nm/llvm-nm.cpp
+++ b/tools/llvm-nm/llvm-nm.cpp
@@ -948,10 +948,11 @@ static void dumpSymbolNamesFromFile(std::string &Filename) {
       if (I != E) {
         outs() << "Archive map\n";
         for (; I != E; ++I) {
-          ErrorOr<Archive::child_iterator> C = I->getMember();
-          if (error(C.getError()))
+          ErrorOr<Archive::child_iterator> ErrorOrChild = I->getMember();
+          if (error(ErrorOrChild.getError()))
             return;
-          ErrorOr<StringRef> FileNameOrErr = C.get()->getName();
+          auto &C = *(ErrorOrChild.get());
+          ErrorOr<StringRef> FileNameOrErr = C.get().getName();
           if (error(FileNameOrErr.getError()))
             return;
           StringRef SymName = I->getName();
@@ -963,7 +964,10 @@ static void dumpSymbolNamesFromFile(std::string &Filename) {
 
     for (Archive::child_iterator I = A->child_begin(), E = A->child_end();
          I != E; ++I) {
-      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = I->getAsBinary(&Context);
+      if (I->getError())
+        break;
+      auto &C = I->get();
+      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary(&Context);
       if (ChildOrErr.getError())
         continue;
       if (SymbolicFile *O = dyn_cast<SymbolicFile>(&*ChildOrErr.get())) {
@@ -1018,8 +1022,11 @@ static void dumpSymbolNamesFromFile(std::string &Filename) {
               for (Archive::child_iterator AI = A->child_begin(),
                                            AE = A->child_end();
                    AI != AE; ++AI) {
+                if (AI->getError())
+                  break;
+                auto &C = AI->get();
                 ErrorOr<std::unique_ptr<Binary>> ChildOrErr =
-                    AI->getAsBinary(&Context);
+                    C.getAsBinary(&Context);
                 if (ChildOrErr.getError())
                   continue;
                 if (SymbolicFile *O =
@@ -1072,8 +1079,11 @@ static void dumpSymbolNamesFromFile(std::string &Filename) {
             for (Archive::child_iterator AI = A->child_begin(),
                                          AE = A->child_end();
                  AI != AE; ++AI) {
+              if (AI->getError())
+                break;
+              auto &C = AI->get();
               ErrorOr<std::unique_ptr<Binary>> ChildOrErr =
-                  AI->getAsBinary(&Context);
+                  C.getAsBinary(&Context);
               if (ChildOrErr.getError())
                 continue;
               if (SymbolicFile *O =
@@ -1121,8 +1131,10 @@ static void dumpSymbolNamesFromFile(std::string &Filename) {
         std::unique_ptr<Archive> &A = *AOrErr;
         for (Archive::child_iterator AI = A->child_begin(), AE = A->child_end();
              AI != AE; ++AI) {
-          ErrorOr<std::unique_ptr<Binary>> ChildOrErr =
-              AI->getAsBinary(&Context);
+          if (AI->getError())
+            continue;
+          auto &C = AI->get();
+          ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary(&Context);
           if (ChildOrErr.getError())
             continue;
           if (SymbolicFile *O = dyn_cast<SymbolicFile>(&*ChildOrErr.get())) {
diff --git a/tools/llvm-objdump/MachODump.cpp b/tools/llvm-objdump/MachODump.cpp
index 9e31bf2..a9ef38c 100644
--- a/tools/llvm-objdump/MachODump.cpp
+++ b/tools/llvm-objdump/MachODump.cpp
@@ -1417,8 +1417,12 @@ static void printArchiveChild(const Archive::Child &C, bool verbose,
   outs() << format("%3d/", UID);
   unsigned GID = C.getGID();
   outs() << format("%-3d ", GID);
-  uint64_t Size = C.getRawSize();
-  outs() << format("%5" PRId64, Size) << " ";
+  ErrorOr<uint64_t> Size = C.getRawSize();
+  if (Size.getError())
+    outs() << "bad size"
+           << " ";
+  else
+    outs() << format("%5" PRId64, Size.get()) << " ";
 
   StringRef RawLastModified = C.getRawLastModified();
   if (verbose) {
@@ -1454,7 +1458,9 @@ static void printArchiveChild(const Archive::Child &C, bool verbose,
 static void printArchiveHeaders(Archive *A, bool verbose, bool print_offset) {
   for (Archive::child_iterator I = A->child_begin(false), E = A->child_end();
        I != E; ++I) {
-    Archive::Child C = *I;
+    if (I->getError())
+      break;
+    const Archive::Child &C = **I;
     printArchiveChild(C, verbose, print_offset);
   }
 }
@@ -1491,7 +1497,10 @@ void llvm::ParseInputMachO(StringRef Filename) {
       printArchiveHeaders(A, !NonVerbose, ArchiveMemberOffsets);
     for (Archive::child_iterator I = A->child_begin(), E = A->child_end();
          I != E; ++I) {
-      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = I->getAsBinary();
+      if (I->getError())
+        break;
+      auto &C = I->get();
+      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
       if (ChildOrErr.getError())
         continue;
       if (MachOObjectFile *O = dyn_cast<MachOObjectFile>(&*ChildOrErr.get())) {
@@ -1539,7 +1548,10 @@ void llvm::ParseInputMachO(StringRef Filename) {
               for (Archive::child_iterator AI = A->child_begin(),
                                            AE = A->child_end();
                    AI != AE; ++AI) {
-                ErrorOr<std::unique_ptr<Binary>> ChildOrErr = AI->getAsBinary();
+                if (AI->getError())
+                  break;
+                auto &C = AI->get();
+                ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
                 if (ChildOrErr.getError())
                   continue;
                 if (MachOObjectFile *O =
@@ -1581,7 +1593,10 @@ void llvm::ParseInputMachO(StringRef Filename) {
             for (Archive::child_iterator AI = A->child_begin(),
                                          AE = A->child_end();
                  AI != AE; ++AI) {
-              ErrorOr<std::unique_ptr<Binary>> ChildOrErr = AI->getAsBinary();
+              if (AI->getError())
+                break;
+              auto &C = AI->get();
+              ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
               if (ChildOrErr.getError())
                 continue;
               if (MachOObjectFile *O =
@@ -1617,7 +1632,10 @@ void llvm::ParseInputMachO(StringRef Filename) {
           printArchiveHeaders(A.get(), !NonVerbose, ArchiveMemberOffsets);
         for (Archive::child_iterator AI = A->child_begin(), AE = A->child_end();
              AI != AE; ++AI) {
-          ErrorOr<std::unique_ptr<Binary>> ChildOrErr = AI->getAsBinary();
+          if (AI->getError())
+            break;
+          auto &C = AI->get();
+          ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
           if (ChildOrErr.getError())
             continue;
           if (MachOObjectFile *O =
diff --git a/tools/llvm-objdump/llvm-objdump.cpp b/tools/llvm-objdump/llvm-objdump.cpp
index 7292841..3440193 100644
--- a/tools/llvm-objdump/llvm-objdump.cpp
+++ b/tools/llvm-objdump/llvm-objdump.cpp
@@ -1536,7 +1536,12 @@ static void DumpObject(const ObjectFile *o) {
 
 /// @brief Dump each object file in \a a;
 static void DumpArchive(const Archive *a) {
-  for (const Archive::Child &C : a->children()) {
+  for (auto &ErrorOrChild : a->children()) {
+    if (std::error_code EC = ErrorOrChild.getError()) {
+      report_error(a->getFileName(), EC);
+      break;
+    }
+    const Archive::Child &C = *ErrorOrChild;
     ErrorOr<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
     if (std::error_code EC = ChildOrErr.getError())
       if (EC != object_error::invalid_file_type)
diff --git a/tools/llvm-readobj/llvm-readobj.cpp b/tools/llvm-readobj/llvm-readobj.cpp
index cb0c9c6..b59579d 100644
--- a/tools/llvm-readobj/llvm-readobj.cpp
+++ b/tools/llvm-readobj/llvm-readobj.cpp
@@ -377,7 +377,12 @@ static void dumpObject(const ObjectFile *Obj) {
 
 /// @brief Dumps each object file in \a Arc;
 static void dumpArchive(const Archive *Arc) {
-  for (const auto &Child : Arc->children()) {
+  for (auto &ErrorOrChild : Arc->children()) {
+    if (std::error_code EC = ErrorOrChild.getError()) {
+      reportError(Arc->getFileName(), EC.message());
+      break;
+    }
+    const auto &Child = *ErrorOrChild;
     ErrorOr<std::unique_ptr<Binary>> ChildOrErr = Child.getAsBinary();
     if (std::error_code EC = ChildOrErr.getError()) {
       // Ignore non-object files.
diff --git a/tools/llvm-size/llvm-size.cpp b/tools/llvm-size/llvm-size.cpp
index de2b045..2695140 100644
--- a/tools/llvm-size/llvm-size.cpp
+++ b/tools/llvm-size/llvm-size.cpp
@@ -427,7 +427,13 @@ static void PrintFileSectionSizes(StringRef file) {
     for (object::Archive::child_iterator i = a->child_begin(),
                                          e = a->child_end();
          i != e; ++i) {
-      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = i->getAsBinary();
+      if (i->getError()) {
+        errs() << ToolName << ": " << file << ": " << i->getError().message()
+               << ".\n";
+        break;
+      }
+      auto &c = i->get();
+      ErrorOr<std::unique_ptr<Binary>> ChildOrErr = c.getAsBinary();
       if (std::error_code EC = ChildOrErr.getError()) {
         errs() << ToolName << ": " << file << ": " << EC.message() << ".\n";
         continue;
@@ -489,7 +495,13 @@ static void PrintFileSectionSizes(StringRef file) {
               for (object::Archive::child_iterator i = UA->child_begin(),
                                                    e = UA->child_end();
                    i != e; ++i) {
-                ErrorOr<std::unique_ptr<Binary>> ChildOrErr = i->getAsBinary();
+                if (std::error_code EC = i->getError()) {
+                  errs() << ToolName << ": " << file << ": " << EC.message()
+                         << ".\n";
+                  break;
+                }
+                auto &c = i->get();
+                ErrorOr<std::unique_ptr<Binary>> ChildOrErr = c.getAsBinary();
                 if (std::error_code EC = ChildOrErr.getError()) {
                   errs() << ToolName << ": " << file << ": " << EC.message()
                          << ".\n";
@@ -566,7 +578,13 @@ static void PrintFileSectionSizes(StringRef file) {
             for (object::Archive::child_iterator i = UA->child_begin(),
                                                  e = UA->child_end();
                  i != e; ++i) {
-              ErrorOr<std::unique_ptr<Binary>> ChildOrErr = i->getAsBinary();
+              if (std::error_code EC = i->getError()) {
+                errs() << ToolName << ": " << file << ": " << EC.message()
+                       << ".\n";
+                break;
+              }
+              auto &c = i->get();
+              ErrorOr<std::unique_ptr<Binary>> ChildOrErr = c.getAsBinary();
               if (std::error_code EC = ChildOrErr.getError()) {
                 errs() << ToolName << ": " << file << ": " << EC.message()
                        << ".\n";
@@ -630,7 +648,12 @@ static void PrintFileSectionSizes(StringRef file) {
         for (object::Archive::child_iterator i = UA->child_begin(),
                                              e = UA->child_end();
              i != e; ++i) {
-          ErrorOr<std::unique_ptr<Binary>> ChildOrErr = i->getAsBinary();
+          if (std::error_code EC = i->getError()) {
+            errs() << ToolName << ": " << file << ": " << EC.message() << ".\n";
+            break;
+          }
+          auto &c = i->get();
+          ErrorOr<std::unique_ptr<Binary>> ChildOrErr = c.getAsBinary();
           if (std::error_code EC = ChildOrErr.getError()) {
             errs() << ToolName << ": " << file << ": " << EC.message() << ".\n";
             continue;


More information about the llvm-commits mailing list