[PATCH] D13989: This removes the eating of the error in Archive::Child::getSize() when the charactersin the size field in the archive header for the member is not a number.
Kevin Enderby via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 2 12:41:01 PST 2015
Hi Rafael,
Again thanks for all your help on this and willingness to take a second look and the option of idea of the iterator having an ErrorOr<Child>.
I have updated your diffs with changes to the issues you noted and they are here:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: updated_t.diff
Type: application/octet-stream
Size: 31514 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151102/51991faa/attachment.obj>
-------------- next part --------------
I think the way forward at this point is to abandon the existing patch and start again with new patch based on the diffs above.
And do have a few comments in line below on some of the specifics.
Thanks again,
Kev
> On Nov 2, 2015, at 5:38 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>
> I have changed code that was using archive_iterator in cases where
> end() (or error) was not possible to use Child.
>
> This reduced the impact of this change. An rebased patch is attached.
>
> In addition to the previous issue, I noticed that we use 'auto' in
> case the style guide recommends against:
>
> if (auto Err = ChildOrErr.getError())
>
> should be just std::error_code.
>
Done.
>
> - outs() << ' ' << format("%6llu", C.getSize());
> + ErrorOr<uint64_t> Size = C.getSize();
> + if (Size.getError())
> + outs() << ' ' << "bad size";
>
> It would be much better for llvm-ar (and any tool IMHO) to fail when
> finding a corrupted file. In the case of llvm-ar, just call fail.
I would say it depends on the tool and what the tool is doing.
In the case above the llvm-ar tool is printing out the archive header. Since I often run tools on malformed files during development I like tools to show be what is there rather than just calling “fail” and bailing out in my opinion. In this case what I really want is to see the characters of the archive header’s field for Size as during development I likely messed things up and need to see what is there so I can fix things.
But to do this I would need to use the getHeader() method then print the Size[10] characters in this case. And this really should be done for all the fields. Which would mean making all the methods like getUID(), getGID(), etc, all return ErrorOr<> like getSize(). But that seams like changes that should be left for another patch.
>
>
> Cheers,
> Rafael
>
>
> On 31 October 2015 at 19:08, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>> The main problem with the patch was making Archive.cpp more
>> complicated. I have fixed that by not storing child_iterators in the
>> Archive.
>>
>> Now I think the main remaining issues are cases like
>>
>> - if (ChildIt != A->child_end()) {
>> + if (*ChildIt && ChildIt != A->child_end()) {
>>
>> This silently ignores the error. To keep this patch small, I would
>> suggest that for now this be changed to
>>
>> If (std::error_code EC = ChildIt->getError())
>> report_fatal_error(EC.message());
>> if (ChildIt != A->child_end()) {
>> …
OK, I have made these changes.
>>
>> A followup patch can propagate the error.
I agree with you on that but I think the two places that needed this change need to deal better with errors. As they are in the Jit and should not call report_fatal_error() with malformed input. But I think it best for Lang or others to clean up the error handling and work to eliminate the use of report_fatal_error() for malformed input.
>>
>> I have attached a patch rebased on trunk that stores an ErrorOr<Child>
>> in the iterator.
Got it and my diffs above were based on that.
>>
>> Cheers,
>> Rafael
>>
>>
>>
>>
>> On 31 October 2015 at 10:14, Rafael Espíndola
>> <rafael.espindola at gmail.com> wrote:
>>> OK, I am starting to lean a bit more on the idea of the iterator
>>> having an ErrorOr<Child>. I will take a second look at that option.
>>>
>>> Sorry for the detour.
>>>
>>> Cheers,
>>> Rafael
>>>
>>>
>>> On 29 October 2015 at 22:53, Lang Hames via llvm-commits
>>> <llvm-commits at lists.llvm.org> wrote:
>>>> That's a good point: next() has to return an ErrorOr<iterator>, and you have
>>>> to distinguish between failure and end(), so if you want to check each call
>>>> to next() you need to do something like:
>>>>
>>>> auto ErrOrItr = File->error_or_child_begin();
>>>> if (!ErrOrItr)
>>>> return ErrOrItr.getError();
>>>> auto Itr = *ErrOrItr;
>>>> while (Itr != file->child_end()) {
>>>> auto &Thing = *Itr;
>>>>
>>>> ErrOrItr = increment(Itr);
>>>> if (!ErrOrItr)
>>>> return ErrOrItr.getError();
>>>> }
>>>>
>>>> If you write an iterator wrapper that includes a failure state you can get
>>>> that down to:
>>>>
>>>> auto ErrOrItr = File->child_begin();
>>>> for (; ErrOrItr && ErrOrItr != File->child_end(); ErrOrItr.next()) {
>>>> Child = *ErrOrItr;
>>>> // ...
>>>> }
>>>> if (!ErrOrItr)
>>>> return ErrOrItr.getError();
>>>>
>>>> But now you have to be careful to check the error outside the loop.
>>>>
>>>> Looking at Kevin's updated diff, the change to the iterators made those
>>>> loops a lot uglier. The introduction of the Increment function fixes some of
>>>> that, but at the cost of risking the problem that you were trying to avoid:
>>>> It's suddenly very easy to "increment" the iterator without actually
>>>> checking the error.
>>>>
>>>> - Lang.
>>>>
>>>>
>>>> On Thu, Oct 29, 2015 at 8:30 PM, Kevin Enderby <enderby at apple.com> wrote:
>>>>>
>>>>> Hi Lang and others,
>>>>>
>>>>> This iterator basically requires another indirection to get to the
>>>>> underlying Child. So you get a compile error when trying to use the
>>>>> iterator as it is an <ErrorOr>. So the example really is to change
>>>>> something like this (as the diff in ELF/InputFiles.cpp):
>>>>>
>>>>> for (const Archive::Child &Child : File->children()) {
>>>>> …
>>>>>
>>>>> to this:
>>>>>
>>>>> for (auto &ChildOrErr : File->children()) {
>>>>> error(ChildOrErr, "Could not get the child of the archive " +
>>>>> File->getFileName());
>>>>> const Archive::Child Child(*ChildOrErr);
>>>>> ...
>>>>>
>>>>> Which seems a bit cleaner than testing the result of child_begin() before
>>>>> the for loop and checking getNext() even if it can be wrapped in some case
>>>>> in and Increment() routine.
>>>>>
>>>>> My thoughts,
>>>>> Kev
>>>>>
>>>>> On Oct 29, 2015, at 7:50 PM, Lang Hames via llvm-commits
>>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>
>>>>> Yep. The standard C++ iterator concept doesn't apply here, since it
>>>>> doesn't deal with failure, and checking on deref rather than increment means
>>>>> you can accidentally do multiple increments without checking. My intent was
>>>>> to have increment crash if you run it on an error'd out iterator, but that's
>>>>> not as good as a compile-time check. I think it's a matter of personal taste
>>>>> whether the cost (IMHO fairly minimal, since in practice you almost always
>>>>> deref after increment) is worth paying for the gain (also arguably minimal)
>>>>> of a neater iteration idiom.
>>>>>
>>>>> I don't have particularly strong feelings either way.
>>>>>
>>>>> - Lang.
>>>>>
>>>>> On Thu, Oct 29, 2015 at 5:51 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Oct 29, 2015 at 5:27 PM, Lang Hames via llvm-commits
>>>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>>>
>>>>>>> Hi Rafael,
>>>>>>>
>>>>>>> How could you ignore the error in Kevin's solution?
>>>>>>>
>>>>>>> I think your solution and his are essentially the same, except that you
>>>>>>> check the error on increment and he checks it on dereference. In practice
>>>>>>> you always have to do both operations (you can't do multiple increments
>>>>>>> without checking the error anyway, and you can just do that by deref'ing),
>>>>>>> and checking via deref allows you to used range-based for, which makes the
>>>>>>> code neater.
>>>>>>
>>>>>>
>>>>>> I'm not taking a position either way - but that does present a difficult
>>>>>> problem for the iterator contract, to say that you can't increment it twice
>>>>>> without dereferencing... that's not really in the iterator playbook/contract
>>>>>> (could implement a little extra support for that by allowing an error-state
>>>>>> iterator to be incremented to become the end iterator). I know you've
>>>>>> mentioned it before (the "can't increment twice without checking") but it
>>>>>> became a bit clearer to me in this recent email exchange to the point that I
>>>>>> would be a little less comfortable with it.
>>>>>>
>>>>>> I can't find the example you guys had:
>>>>>>
>>>>>> for (auto &Thing : things) {
>>>>>> if (!Thing)
>>>>>> return Thing.getError();
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> & yeah, nothing that comes to mind is as tidy/simple as that, my nearest
>>>>>> might be:
>>>>>>
>>>>>> auto Iter = things.begin();
>>>>>> while (auto &Thing = Iter.Next()) {
>>>>>> ....
>>>>>> }
>>>>>> if (!Iter)
>>>>>> return Iter.getError();
>>>>>>
>>>>>> (where "Iter" isn't a traditional iterator in any sense - it has
>>>>>> ErrorOr<T> Next() and getError (& possibly boolean testability))
>>>>>>
>>>>>> The obvious sort of problem I imagine arising from the iterator/range API
>>>>>> is something like:
>>>>>>
>>>>>> int count = std::distance(things.begin(), things.end());
>>>>>>
>>>>>> Easy to write, and would produce an infinite loop (or crash?) if there
>>>>>> was an error during iteration.
>>>>>>
>>>>>> - Dave
>>>>>>
>>>>>> - Dave
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Lang.
>>>>>>>
>>>>>>> On Thu, Oct 29, 2015 at 5:11 PM, Rafael Espíndola
>>>>>>> <rafael.espindola at gmail.com> wrote:
>>>>>>>>
>>>>>>>> On 28 October 2015 at 15:21, Kevin Enderby <enderby at apple.com> wrote:
>>>>>>>>> enderby added a comment.
>>>>>>>>>
>>>>>>>>> I agree with Lang and like the original patch where were were
>>>>>>>>> handling failure was done inside the iterator. It does allow use of
>>>>>>>>> ranged-based for loops and appears to me much cleaner in the code especially
>>>>>>>>> in the lld changes needed in http://reviews.llvm.org/D13990 .
>>>>>>>>
>>>>>>>> I think that the most important consideration is making the error hard
>>>>>>>> to ignore. That means using ErrorOr.
>>>>>>>>
>>>>>>>> The sad reality is that the archive format can cause us to find an
>>>>>>>> error when going from one member to the next. That, with the above
>>>>>>>> requirement means that it doesn't fit on range loop.
>>>>>>>>
>>>>>>>>> Rafael, I did "finish it up" based on your t.diff patch and that is
>>>>>>>>> below and includes the re-formatting with clang-format-diff, removal of the
>>>>>>>>> malformed-archives directory putting the .a files in Inputs and removal of
>>>>>>>>> the "nice but independent" changes.
>>>>>>>>>
>>>>>>>>> F1026846: llvm_updated_t.diff <http://reviews.llvm.org/F1026846>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> You don't need to change ErrorOr.h.
>>>>>>>>
>>>>>>>> The change to detect the end of the archive seems more complicated
>>>>>>>> than necessary (see the attached patch for a suggestion).
>>>>>>>>
>>>>>>>> Cases like the Increment in BinaryHolder expose a naked EC, which is
>>>>>>>> what using ErrorOr is trying to avoid. Just inlining it solves the
>>>>>>>> problem (see patch).
>>>>>>>>
>>>>>>>> Please convert the remaining user of Increment to return void when
>>>>>>>> possible.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Rafael
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
> <t.diff>
More information about the llvm-commits
mailing list