[PATCH] D16599: ELF: Define another entry point.

Tue Feb 2 08:44:57 PST 2016

On Mon, Feb 1, 2016 at 11:05 PM, Sean Silva via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

>
>
> On Mon, Feb 1, 2016 at 12:27 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> Even if a file is technically sane, you can craft a malicious one; for
>> example, you can probably crash the linker by OOM by setting a very large
>> number as an alignment requirement for each section so that the size of
>> output becomes huge. It is easily doable using assembly. So my answer
>> is "any clang or gcc produced .o not including inline asm". (It does not
>> mean that we do not try to recover from errors caused by bad assembly code,
>> but we don't/can't guarantee 100% recovery.)
>>
>
> You can probably find some way to set the alignment using an attribute or
> whatever even from clang (and without inlineasm).
>
> I don't think there is a platonically-ideal answer for this. It's more
> about goals:
> - as a command line tool, we don't want legitimate users to see us
> crashing during normal use (if a user is intentionally trying to kill LLD,
> it is not as embarrassing though, so we don't need to worry much about that
> case).
> - we want to be useful (someday) as a library that can be safely used
> in-process, so we need to provide certain guarantees (but these are not
> hugely constraining, because we can assume that the calling code is
> programmatically generating the file in good faith).
>

I don't think this is a valid assumption for all programmatic users. Indeed,
Clang and LLVM both have ways of accepting untrusted inputs. The assumption
in LLVM is "if it's not already in the in-memory representation, it's not
trusted" (parsing bitcode, reading files, etc.), and I think the same would
probably be reasonable in lld. Callers with object contents in memory (or
even a higher-level representation, analogous to the difference between LLVM
IR and LLVM bitcode in a memory buffer) can choose to have lld assume
validity, if they produced the object from an API they trust or are willing
to bugfix if it's ever wrong. Or they can ask for verification, if they got
the object over a network connection or from another untrusted source
(perhaps read out of a compressed archive, etc.).

An API integration of LLD into the Clang driver wouldn't be a sound place to
make this assumption: some objects may be passed to Clang (not generated by
it) from some other compilation or source, for example.
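
To make the "assume validity vs. ask for verification" split concrete, here
is a rough sketch in Python (not lld code). Every name in it (Trust, Reloc,
Section, verify, link, MAX_ALIGNMENT) is hypothetical, the alignment cap is
an arbitrary number picked for illustration, and the "lookbehind" field is my
assumption about how a linker that inspects instruction bytes before a
relocated field (as a GOTPCRELX-style relaxation might) would model that
access. It only covers the two failure modes mentioned in this thread: a huge
alignment request and a relocation placed where the bytes around it do not
exist.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Trust(Enum):
    TRUSTED = "trusted"      # caller built the object in memory and vouches for it
    UNTRUSTED = "untrusted"  # bytes came from a file, network, archive, etc.


@dataclass
class Reloc:
    offset: int          # byte offset of the relocated field within the section
    width: int           # number of bytes the relocation patches
    lookbehind: int = 0  # bytes before `offset` the linker inspects (assumed)


@dataclass
class Section:
    name: str
    size: int        # section size in bytes
    alignment: int   # requested alignment in bytes (0 or 1 means "none")
    relocs: List[Reloc] = field(default_factory=list)


MAX_ALIGNMENT = 1 << 20  # arbitrary cap for this sketch only


def verify(section: Section) -> List[str]:
    """Return a list of problems; an empty list means the section passed."""
    problems = []
    a = section.alignment
    if a < 0 or (a & (a - 1)) != 0:
        problems.append(f"{section.name}: alignment {a} is not a power of two")
    elif a > MAX_ALIGNMENT:
        problems.append(f"{section.name}: alignment {a} exceeds the cap")
    for r in section.relocs:
        first = r.offset - r.lookbehind  # first byte the linker would touch
        last = r.offset + r.width        # one past the last byte it would touch
        if first < 0 or last > section.size:
            problems.append(
                f"{section.name}: relocation at offset {r.offset} is out of bounds"
            )
    return problems


def link(sections: List[Section], trust: Trust) -> List[str]:
    """Hypothetical entry point: run verification only for untrusted input."""
    if trust is Trust.TRUSTED:
        return []  # caller vouches for the input, so skip the checks
    errors: List[str] = []
    for s in sections:
        errors.extend(verify(s))
    return errors
```

With this shape, a caller that produced the sections itself passes
Trust.TRUSTED and pays nothing for the checks, while a driver that was handed
a .o from elsewhere passes Trust.UNTRUSTED and gets a list of diagnostics
back instead of a crash.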

>
> -- Sean Silva
>
>
>>
>> On Mon, Feb 1, 2016 at 12:11 PM, Rafael Espíndola <
>> rafael.espindola at gmail.com> wrote:
>>
>>> On 1 February 2016 at 15:06, Rui Ueyama <ruiu at google.com> wrote:
>>> > On Mon, Feb 1, 2016 at 11:57 AM, Rafael Espíndola
>>> > <rafael.espindola at gmail.com> wrote:
>>> >>
>>> >> On 1 February 2016 at 14:46, Sean Silva <chisophugis at gmail.com> wrote:
>>> >> > I think one of the main use cases that has been requested is to be able
>>> >> > to programmatically call the linker with "known good" object files
>>> >> > (i.e. produced by the compiler). That simplifies things a lot. Rui's
>>> >> > recent patches that are thread_local'izing existing globals seem like
>>> >> > a satisfactory approach. Or am I missing something?
>>> >>
>>> >> Yes, known good files are a lot easier to handle. We just have to be
>>> >> clear what "known good" is.
>>> >>
>>> >> > The R_X86_64_REX_GOTPCRELX situation can probably be likened to someone
>>> >> > giving clang a piece of source code with an inline asm that has:
>>> >> >
>>> >> > .text
>>> >> > .byte <some garbage>
>>> >> >
>>> >> > in it. We don't guarantee that the output "makes sense" because there's
>>> >> > really no way for us to know what "makes sense" in a precise way (i.e.,
>>> >> > a way that we can program).
>>> >>
>>> >> Would we still be required to check the offsets so we don't crash? An
>>> >> assembly file can contain
>>> >>
>>> >> .reloc 0, R_X86_64_REX_GOTPCRELX, foo
>>> >> .long 4
>>> >>
>>> >> which would put that relocation in an invalid location. In general, is
>>> >> an arbitrary assembly file to be considered "known good"? Is that true
>>> >> even for things like
>>> >>
>>> >> .section .eh_frame, ....
>>> >> garbage
>>> >>
>>> >> that the linker has to parse?
>>> >
>>> >
>>> > I think the answer is case-by-case, but I don't think we have to guarantee
>>> > to recover from errors caused by carefully-crafted malicious object files.
>>> > (Is there anyone who disagrees with that?)
>>>
>>> It is definitely not a use case *I* have an interest in. I just want
>>> there to be an agreement on what use case we want to support at the moment.
>>> Is it "any .o file", "any llvm-mc or gas produced .o", "any clang or
>>> gcc produced .o not including inline asm"?
>>>
>>> Cheers,
>>> Rafael
>>>
>>
>>
>