[llvm-dev] Need help with code generation

Wed Mar 23 09:11:22 PDT 2016

On Wed, Mar 23, 2016 at 3:52 AM, Rui Ueyama <ruiu at google.com> wrote:

> On Tue, Mar 22, 2016 at 10:33 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
>
>>
>>
>> On Tue, Mar 22, 2016 at 1:29 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> On Tue, Mar 22, 2016 at 9:19 PM, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Mar 22, 2016 at 1:15 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>
>>>>> On Tue, Mar 22, 2016 at 9:00 PM, David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 22, 2016 at 12:36 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>>
>>>>>>> On Tue, Mar 22, 2016 at 7:36 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>>>
>>>>>>>> I have a question. If there is an ELF verifier function that walks
>>>>>>>> every part of an ELF file to verify that the file is sane, and if you can
>>>>>>>> call that before calling LLD's function, are you guys happy with that?
>>>>>>>>
>>>>>>>
>>>>>>> I'd like to get your opinions on this question.
>>>>>>>
>>>>>>
>>>>>> I'd still find it problematic that lld itself would consider
>>>>>> crash-on-invalid "not a bug" to the point of not reviewing/approving
>>>>>> patches to fix such issues. That's what I'm concerned about in this thread.
>>>>>>
>>>>>
>>>>> That's one way to see it. The other view is that it, as a whole, has a
>>>>> boolean option *IsInputTrustworthy* and works accordingly. What
>>>>> matters most is what we provide to the users as a guarantee. You have an
>>>>> opinion that that should be implemented within LLD, but that would now be an
>>>>> internal design choice. Hypothetically, if we had such a pass to verify inputs
>>>>> and you sent a patch to "fix" a crash bug in LLD, we probably wouldn't
>>>>> reject it but would instead argue that it needs to be addressed in the
>>>>> verifier pass. This is about "how" something should be implemented,
>>>>> the usual design-choice discussion, no?
>>>>>
>>>>
>>>> OK, sorry - some confusion. I assume that you wouldn't run this
>>>> verifier pass by default in lld-the-command-line-tool, right? (I would
>>>> guess it wouldn't meet your performance criteria)
>>>>
>>>
>>> Correct.
>>>
>>>
>>>> So from that perspective, lld-the-command-line-tool would still be
>>>> crashing-by-design on certain inputs in its default/normal/user-facing mode
>>>> & that would seem problematic to me.
>>>>
>>>
>>> I disagree. You are in almost all cases handling valid object files
>>> created by compilers, and if omitting some error checks for really pathological
>>> cases would make the code simpler and improve performance, I think having that
>>> option is worth it (and I believe that's the case; at least those who are
>>> actually writing the code seem to take that stance). Also, if you give broken
>>> object files, you wouldn't get an output anyway. The only difference that
>>> a user can observe is whether it dies with an error message or not. We
>>> could even catch a segfault and run the verifier on the input again to
>>> print out an error after something goes wrong.
>>>
>>
>> Sure, I understand that we disagree here, I was merely answering your
>> question "I have a question. If there is an ELF verifier function that walks
>> every part of an ELF file to verify that the file is sane, and if you can
>> call that before calling LLD's function, are you guys happy with that?" to
>> help you understand that point of disagreement/my position (& probably the
>> position of other people on this thread)
>>
>
> Yes, I understand that this is where we disagree. LLD is robust in my
> definition and we are vigorously trying to make it so. To me, however, it
> is unfortunate but acceptable if LLD crashes on a malicious, hand-crafted
> object file which is intended to crash LLD if the cost of fixing it is too
> expensive. I understand a number of people are concerned about the design
> choices. If the LLVM foundation or whatever wants to say that all LLVM
> projects must have these minimum standards, and those standards include
> this design choice, we'd do that. But that does not seem to be the case
> right now.
>
> I'd be happy if we could handle all possible errors elegantly with very
> low overhead, but there seems to be no easy way to do that. We may just not
> have found it yet. If you come up with a solution and send us a patch, we
> can discuss that, but what is happening here is that a number of people who
> are thinking that our design choice is unreasonable are not contributing to
> the project, so we are repeating the same discussion.
>
> The reason why I asked that question is that it guarantees that we can
> provide protection for those who want 100% crash-freeness (although I
> doubt how effective it is from the user's point of view compared to
> other components which have crash bugs, as discussed in the thread). People
> who are in this thread seem to believe that the design choice is
> irreversible, or will reach a point where it is irreversible because we
> will have written too much code with the design, so we won't be able to
> "fix" any crash "bugs" in the future. That is not the case. We can at least
> provide a new pass anytime to rigorously check any user input. Whether it
> needs to be a separate pass or an integrated one is a detailed design choice
> that needs first-hand knowledge of the code base.
>

Except we all seem to agree that this pass would probably make lld rather
slow - possibly slower than gold/binutils ld, I assume? (perhaps that's an
incorrect assumption) so at that point I imagine people would just go back
to using those tools which, as Paul pointed out, do treat UB on invalid
input as bugs today.
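
To make the trade-off concrete, here is a minimal sketch of what an
up-front verifier gated by a trust flag could look like. This is not LLD
code: verifyElf64Header, linkOne and the isInputTrustworthy parameter are
hypothetical names, and the check covers only the ELF64 little-endian
header and the bounds of the section header table. A real verifier would
have to walk every section, symbol and relocation, which is where the
cost discussed above would come from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not part of LLD.
struct VerifyResult {
  bool ok;
  std::string message;
};

// Read an n-byte little-endian integer, independent of host endianness.
static uint64_t readLE(const uint8_t *p, size_t n) {
  uint64_t v = 0;
  for (size_t i = 0; i < n; ++i)
    v |= static_cast<uint64_t>(p[i]) << (8 * i);
  return v;
}

// Checks only the ELF64 header: magic, class, byte order, and that the
// section header table lies inside the buffer.
VerifyResult verifyElf64Header(const std::vector<uint8_t> &buf) {
  if (buf.size() < 64)
    return {false, "file too small for an ELF64 header"};
  static const uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};
  if (std::memcmp(buf.data(), magic, 4) != 0)
    return {false, "bad ELF magic"};
  if (buf[4] != 2) // EI_CLASS must be ELFCLASS64
    return {false, "not ELFCLASS64"};
  if (buf[5] != 1) // EI_DATA must be ELFDATA2LSB
    return {false, "not little-endian"};

  uint64_t shoff = readLE(&buf[40], 8);     // e_shoff
  uint64_t shentsize = readLE(&buf[58], 2); // e_shentsize
  uint64_t shnum = readLE(&buf[60], 2);     // e_shnum
  uint64_t tableSize = shentsize * shnum;   // both < 2^16, cannot overflow
  if (shoff > buf.size() || tableSize > buf.size() - shoff)
    return {false, "section header table out of bounds"};
  return {true, ""};
}

// Hypothetical driver: the verifier runs only for untrusted input.
bool linkOne(const std::vector<uint8_t> &buf, bool isInputTrustworthy,
             std::string &err) {
  if (!isInputTrustworthy) {
    VerifyResult r = verifyElf64Header(buf);
    if (!r.ok) {
      err = r.message;
      return false;
    }
  }
  // The actual link step would go here; it assumes well-formed input.
  return true;
}
```

With the flag set to trusted, a malformed file passes straight through to
the link step, which is exactly the default-mode behavior I'm concerned
about.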

> I also strongly disagree that it will become irreversible as we write more
> code. It is plainly wrong. We are doing large incremental refactoring
> pretty often (I'm the person who is doing it most often), and if you come
> up with a way to handle every possible error elegantly with very low
> overhead, we could do that as well.
>

Indeed, we do do pretty large refactorings across the LLVM project
regularly - the position/argument is that building such things in from the
beginning may be relatively cheap compared to the refactoring cost of
adding things later. This is the difficult balance we all make on so many
design decisions across the LLVM project.

There's also a difficult balance between placing the cost when/where it is
needed (the person who wants a library perhaps should be the one paying the
engineering cost of making a library (assuming it was no more expensive to
do it when-needed than to build it in earlier, which is unclear)) or to lay
some groundwork to enable the possibility of new scenarios with relatively
low cost (Clang being a great example of this - building Clang as a library
was/is a core design goal that enables many features the original Clang
developers never could've foreseen - if we expected the first person to
write a syntax highlighter (or even the larger Clang Tooling/AST Matcher
infrastructure) to library-ify Clang, those efforts wouldn't've happened,
or would've happened by building a new compiler (one of the major reasons
Clang was built was that those tools were hard to build with GCC)).

>
> Given that, I think you are worrying too much about one design choice that
> I made. If you need a linker which never crashes (whatever that means), then
> I'm sorry but LLD may not be your choice, at least at this moment. It is
> however probably not suitable for your purpose anyway because it's too
> early to use -- we are energetically working on implementing missing
> features. If you have a suggestion to expand its scope without
> sacrificing usability in other fields, we'd be happy to discuss that.
>
>> (UB is more than just segfaulting, though. So there are more possible
>> failures (including silently producing output & exiting with 0) than
>> segfault.)
>>
>> - David
>>
>>
>>>
>>>> If you're proposing having this verifier run by default - sure, then I
>>>> can't construct an input that crashes the linker, it'll fail with an error
>>>> message and exit. Yes, that would be fine by me - the way the feature is
>>>> implemented is not something I mean to imply constraints on.
>>>>
>>>> - David
>>>>
>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>> On Tue, Mar 22, 2016 at 6:39 PM, Hal Finkel via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------
>>>>>>>>>
>>>>>>>>> *From: *"David Blaikie via llvm-dev" <llvm-dev at lists.llvm.org>
>>>>>>>>> *To: *"Rafael Espíndola" <rafael.espindola at gmail.com>
>>>>>>>>> *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>, "Bruce Hoult" <
>>>>>>>>> bruce at hoult.org>
>>>>>>>>> *Sent: *Tuesday, March 22, 2016 10:18:03 AM
>>>>>>>>> *Subject: *Re: [llvm-dev] Need help with code generation
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 22, 2016 at 4:27 AM, Rafael Espíndola <
>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> > Maybe not, but it's not impossible either - browsers manage to
>>>>>>>>>> > harden themselves against malicious input and they operate in a far more
>>>>>>>>>> > hostile environment with many more input formats than we do.
>>>>>>>>>>
>>>>>>>>>> It is important to note how different they are. Both Firefox and
>>>>>>>>>> Chromium have people working just to try to make them more secure.
>>>>>>>>>> Compare that with LLVM: One week ago I pointed out that your patch
>>>>>>>>>> (r263521) introduces a crash. It still hasn't been reverted or
>>>>>>>>>> even acknowledged yet.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> > I'm not trying to shift your personal goal, or to direct the
>>>>>>>>>> > features that you choose to put your time into, but I am interested in
>>>>>>>>>> > project policy.
>>>>>>>>>>
>>>>>>>>>> Why do you care about policy that is not followed? A policy saying
>>>>>>>>>> llvm should not crash on any input is as relevant as one that says
>>>>>>>>>> that clang should keep bootstrapping in under one second.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's pretty different when you say, essentially, that patches to
>>>>>>>>> address these things are unlikely to be accepted. It doesn't seem
>>>>>>>>> surprising that people wouldn't try to provide those patches and would
>>>>>>>>> choose not to use the project if that's the expressed policy of the
>>>>>>>>> developers on the project and doesn't line up with the needs of other
>>>>>>>>> people.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>>  -Hal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So, if we stick to reality, what we have is that lld (ELF and COFF)
>>>>>>>>>> are already the most reliable parts of the toolchain. If not for Rui
>>>>>>>>>> and me being upfront about it, most people would not even know that you
>>>>>>>>>> could crash it. So please, just let us keep working on the most
>>>>>>>>>> reliable part of the toolchain.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Rafael
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Hal Finkel
>>>>>>>>> Assistant Computational Scientist
>>>>>>>>> Leadership Computing Facility
>>>>>>>>> Argonne National Laboratory
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>