[PATCH] D16599: ELF: Define another entry point.

Tue Feb 2 15:08:08 PST 2016

Sounds good to me. We can wordsmith it afterwards if needed.

Cheers,
Rafael

On 2 February 2016 at 18:04, Rui Ueyama via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
> I'm going to add this to the linker.
>
> +// Entry point of the ELF linker. Returns true on success. It is
> +// guaranteed to return as long as you do not pass corrupted or malicious
> +// object files. A corrupted file could cause a fatal error or SEGV.
> +// That being said, you don't need to worry too much about it if you
> +// create object files in the usual way and feed them to the linker
> +// (that is naturally expected to work; otherwise it is a linker bug).
>  bool link(ArrayRef<const char *> Args,
>            llvm::raw_ostream &Error = llvm::errs());
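
A minimal sketch of how a caller might rely on the contract described in the comment above: the entry point returns false and writes its diagnostics to the supplied stream rather than exiting. `linkSketch` and `runLink` are hypothetical stand-ins, and std types replace `ArrayRef` and `llvm::raw_ostream` only so the example compiles without LLVM; this is not the code in the patch.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed entry point. The real declaration
// takes ArrayRef<const char *> and llvm::raw_ostream &.
bool linkSketch(const std::vector<const char *> &Args,
                std::ostream &Error = std::cerr) {
  // Args[0] is assumed to be the program name, as in argv.
  if (Args.size() < 2) {
    Error << "no input files\n";
    return false; // Report failure to the caller instead of calling exit().
  }
  // ... the actual link would happen here ...
  return true;
}

// An in-process caller captures diagnostics and keeps running on failure.
// Returns an empty string on success, otherwise the diagnostic text.
std::string runLink(const std::vector<const char *> &Args) {
  std::ostringstream Diag;
  bool Ok = linkSketch(Args, Diag);
  return Ok ? std::string() : Diag.str();
}
```

Passing a stream other than the default lets an embedding tool decide how to surface linker errors.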
>
>
> On Tue, Feb 2, 2016 at 2:25 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>> On Tue, Feb 2, 2016 at 1:42 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>
>>>
>>>
>>> On Tue, Feb 2, 2016 at 8:44 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Mon, Feb 1, 2016 at 11:05 PM, Sean Silva via llvm-commits
>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 1, 2016 at 12:27 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>>
>>>>>> Even if a file is technically sane, you can craft a malicious one; for
>>>>>> example, you can probably crash the linker by OOM by setting a very large
>>>>>> number as an alignment requirement for each section so that the size of
>>>>>> output becomes huge. It is easily doable using assembly. So my answer is
>>>>>> "any clang or gcc produced .o not including inline asm". (It does not mean
>>>>>> that we do not try to recover from errors caused by bad assembly code, but
>>>>>> we don't/can't guarantee 100% recovery.)
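
To make the alignment concern above concrete, here is a rough sketch of the kind of check a linker could apply before honoring a section's alignment. Nothing in this thread says LLD performs such a check; `alignOffset` and the `MaxOutputSize` cap are hypothetical names and values chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical cap on the output size, for illustration only.
constexpr uint64_t MaxOutputSize = 1ULL << 32;

// True if V is a nonzero power of two.
bool isPowerOf2(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Rounds Offset up to a multiple of Align and stores the result in Out.
// Returns false, leaving Out untouched, if Align is not a power of two,
// if the rounding would overflow, or if the result exceeds MaxOutputSize.
bool alignOffset(uint64_t Offset, uint64_t Align, uint64_t &Out) {
  if (!isPowerOf2(Align))
    return false;
  uint64_t Mask = Align - 1;
  if (Offset > UINT64_MAX - Mask)
    return false; // Offset + Mask would wrap around.
  uint64_t Aligned = (Offset + Mask) & ~Mask;
  if (Aligned > MaxOutputSize)
    return false; // A huge alignment would blow up the output size.
  Out = Aligned;
  return true;
}
```

A cap like this is a resource limit rather than a validity check: a file with a very large alignment can still be technically sane, which is the point made above.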
>>>>>
>>>>>
>>>>> You can probably find some way to set the alignment using an attribute
>>>>> or whatever even from clang (and without inlineasm).
>>>>>
>>>>> I don't think there is a platonically-ideal answer for this. It's more
>>>>> about goals:
>>>>> - as a command line tool, we don't want legitimate users to see us
>>>>> crashing during normal use (if a user is intentionally trying to kill LLD,
>>>>> it is not as embarrassing though, so we don't need to worry much about that
>>>>> case).
>>>>> - we want to be useful (someday) as a library that can be safely used
>>>>> in-process, so we need to provide certain guarantees (but these are not
>>>>> hugely constraining, because we can assume that the calling code is
>>>>> programmatically generating the file in good faith).
>>>>
>>>>
>>>> I don't think this is a valid assumption for all programmatic users.
>>>> Indeed, Clang and LLVM both have ways of accepting untrusted inputs: the
>>>> assumption in LLVM is "if it's not already in the in-memory representation,
>>>> it's not trusted" (parsing bitcode, reading files, etc.), and I think the
>>>> same would probably be reasonable in lld. Callers with object contents in
>>>> memory (or even a higher-level representation, analogous to the difference
>>>> between LLVM IR and LLVM bitcode in a memory buffer) can choose to have lld
>>>> assume validity (if they produced it from an API they trust or are willing
>>>> to bugfix if it's ever wrong) or ask for verification (if they got the
>>>> object over a network connection or other untrusted source, perhaps read
>>>> out of a compressed archive, etc.). An API integration of LLD into the
>>>> Clang driver wouldn't be a sound place to make this assumption: some
>>>> objects may be passed to Clang (not generated by it) from some other
>>>> compilation or source, for example.
>>>
>>>
>>> I think these can serve as a baseline that we can document / elaborate on
>>> down the road though.
>>> For the moment, we can document our current intentions/policies. That way
>>> people can either a) concretely file bug reports against us for violating
>>> our intentions or b) we can have a concrete discussion on llvm-dev about
>>> changing those documented policies/intentions.
>>
>>
>> Good point. We need to document the current policy, whatever it is. The
>> current policy, once I submit these pending patches, is that "the linker
>> doesn't crash or exit (or it is a bug) as long as you don't give it
>> corrupted/malicious object files." I will write that in the Driver file,
>> which everyone who wants to use the linker will see.
>>
>>> It seems our current situation is that any time anything related to this
>>> comes up, everybody and their dog start talking about different hypothetical
>>> situations that nobody is actively working on using LLD for (since there are
>>> other, higher priorities right now). These may or may not be true, or the
>>> parallels to clang/LLVM may or may not be true, but currently we don't have
>>> a starting point for a useful discussion. It is all ad-hoc. We need a fixed
>>> point of reference for future discussion and what I posted (in this thread
>>> and others) seems like a sweet spot to start with; it provides reasonable
>>> guarantees and avoids overcommitting our development effort at an early
>>> stage.
>>>
>>> I actually have points to make in response to what you said, but an
>>> llvm-commits discussion is not the right place for them.
>>>
>>> -- Sean Silva
>>>
>>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 12:11 PM, Rafael Espíndola
>>>>>> <rafael.espindola at gmail.com> wrote:
>>>>>>>
>>>>>>> On 1 February 2016 at 15:06, Rui Ueyama <ruiu at google.com> wrote:
>>>>>>> > On Mon, Feb 1, 2016 at 11:57 AM, Rafael Espíndola
>>>>>>> > <rafael.espindola at gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> On 1 February 2016 at 14:46, Sean Silva <chisophugis at gmail.com>
>>>>>>> >> wrote:
>>>>>>> >> > I think one of the main use cases that has been requested is to be
>>>>>>> >> > able to programmatically call the linker with "known good" object
>>>>>>> >> > files (i.e. produced by the compiler). That simplifies things a
>>>>>>> >> > lot. Rui's recent patches that are thread_local'izing existing
>>>>>>> >> > globals seem like a satisfactory approach. Or am I missing
>>>>>>> >> > something?
>>>>>>> >>
>>>>>>> >> Yes, known good files are a lot easier to handle. We just have to be
>>>>>>> >> clear what "known good" is.
>>>>>>> >>
>>>>>>> >> > The R_X86_64_REX_GOTPCRELX situation can probably be likened to
>>>>>>> >> > someone giving clang a piece of source code with an inline asm
>>>>>>> >> > that has:
>>>>>>> >> >
>>>>>>> >> > .text
>>>>>>> >> > .byte <some garbage>
>>>>>>> >> >
>>>>>>> >> > in it. We don't guarantee that the output "makes sense" because
>>>>>>> >> > there's really no way for us to know what "makes sense" in a
>>>>>>> >> > precise way (i.e., a way that we can program).
>>>>>>> >>
>>>>>>> >> Would we still be required to check the offsets so we don't crash?
>>>>>>> >> An assembly file can contain
>>>>>>> >>
>>>>>>> >> .reloc 0, R_X86_64_REX_GOTPCRELX, foo
>>>>>>> >> .long 4
>>>>>>> >>
>>>>>>> >> which would put that relocation in an invalid location. In general,
>>>>>>> >> is an arbitrary assembly file to be considered "known good"? Is that
>>>>>>> >> true even for things like
>>>>>>> >>
>>>>>>> >> .section .eh_frame, ....
>>>>>>> >> garbage
>>>>>>> >>
>>>>>>> >> that the linker has to parse?
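
The offset check asked about above could look roughly like the following sketch. It assumes the relocated field is 4 bytes wide and that the linker wants to inspect some instruction bytes immediately before it (for example a REX prefix, opcode, and ModRM byte, i.e. three bytes); both numbers and the name `relocInBounds` are assumptions for illustration, not LLD's implementation.

```cpp
#include <cstdint>

// Returns true if a 4-byte relocated field at Offset, together with the
// BytesBefore instruction bytes preceding it, lies entirely inside a
// section of SectionSize bytes.
bool relocInBounds(uint64_t Offset, uint64_t SectionSize,
                   uint64_t BytesBefore) {
  const uint64_t FieldSize = 4; // Assumed width of the relocated field.
  if (Offset < BytesBefore)
    return false; // Reading before the start of the section.
  if (SectionSize < FieldSize)
    return false; // Section too small to hold the field at all.
  return Offset <= SectionSize - FieldSize; // Written to avoid overflow.
}
```

With the `.reloc 0, ...` example above, the offset is 0, so any attempt to look at preceding bytes fails the first test and the linker can report an error instead of reading out of bounds.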
>>>>>>> >
>>>>>>> >
>>>>>>> > I think the answer is case-by-case, but I don't think we have to
>>>>>>> > guarantee to recover from errors caused by carefully-crafted
>>>>>>> > malicious object files. (Is there anyone who disagrees with that?)
>>>>>>>
>>>>>>> It is definitely not a use case *I* have an interest in. I just want
>>>>>>> there to be an agreement on what use case we want to support at the
>>>>>>> moment.
>>>>>>> Is it "any .o file", "any llvm-mc or gas produced .o", "any clang or
>>>>>>> gcc produced .o not including inline asm"?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rafael
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>
>>
>
>