[llvm-dev] RFC: Loadable segments watermark for lld

Chris Jackson via llvm-dev llvm-dev at lists.llvm.org
Mon Dec 2 10:16:58 PST 2019


There is discussion of this on Phabricator (https://reviews.llvm.org/D66426).
There is no threat model, as this is not a security feature. We are trying to
detect post-link modifications that result in a binary that relies on
incidental details of the OS. Reliance on these details may impair future work
on the platform. Also, if post-link modifications are detected, we may be able
to identify functionality that is lacking in our platform.
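
To make the idea concrete, here is a minimal sketch of how a hash over the
loadable segments could be recomputed from a linked file. It is not the code
from the reviews above. It assumes a 64-bit little-endian ELF file, hashes the
file contents of each PT_LOAD segment in program-header order, and uses
BLAKE2b from the Python standard library as a stand-in for xxhash. The
function name hash_loadable_segments is hypothetical.

```python
import hashlib
import struct

PT_LOAD = 1  # program header type for loadable segments


def hash_loadable_segments(data: bytes) -> str:
    """Hash the file contents of every PT_LOAD segment of an ELF64 LE image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian files")

    # ELF64 header: e_phoff is at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)

    # Stand-in hash (assumption): the proposal uses xxhash, which is not in
    # the Python standard library.
    digest = hashlib.blake2b(digest_size=8)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type at 0x00, p_flags at 0x04, p_offset at 0x08,
        # p_filesz at 0x20.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 0x20)
        if p_type == PT_LOAD:
            digest.update(data[p_offset:p_offset + p_filesz])
    return digest.hexdigest()
```

A checker would call hash_loadable_segments(open(path, "rb").read()) and
compare the result with the value stored in the note. Bytes outside every
PT_LOAD segment, such as debug sections, do not affect the result.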

On Fri, Nov 29, 2019 at 1:32 PM Jake Ehrlich <jakehehrlich at google.com>
wrote:

> Could you clarify the threat model? Are we preventing bugs or malicious
> attackers?
>
> On Fri, Nov 29, 2019, 3:58 AM Chris Jackson via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> The whole point of the watermark is to show that no post-link
>> modifications have been made, and if the watermark itself is added
>> post-link, it does not achieve this aim: someone could either
>> deliberately or accidentally add a step prior to the watermarking happening.
>>
>> In our case we always map .note.llvm.watermark to a PT_NOTE segment with
>> a linker script. Thus, llvm-objcopy --strip-all does not discard the
>> section. If --strip-all is used when the section is not mapped to a
>> segment, --keep-section can be used to preserve the watermark section.
>>
>> On Wed, Nov 27, 2019 at 7:40 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>>
>>> From previous reviews I recall that .note.llvm.watermark is a
>>> non-SHF_ALLOC section. Linkers normally place non-SHF_ALLOC sections after
>>> SHF_ALLOC ones. I think a post-link tool can be used to append
>>> .note.llvm.watermark to an ELF file. It just needs to update tens to a few
>>> hundred bytes (section header table+content of .note.llvm.watermark),
>>> assuming the position of .note.llvm.watermark does not matter.
>>>
>>> I feel that the reasoning for making .note.llvm.watermark generation an lld
>>> feature is not sufficiently strong. Does it need to be fast? (A benchmark
>>> measuring the performance will be useful.) .note.llvm.watermark seems to be
>>> only used in releases. Releases naturally involve a lot of preparation and
>>> verification. I can't imagine that running a post-link tool can be a
>>> bottleneck. During development .note.llvm.watermark is probably not very
>>> useful.
>>>
>>> If .note.llvm.watermark is indeed non-SHF_ALLOC, it can be discarded by
>>> llvm-objcopy/llvm-strip --strip-all, but not by --strip-all-gnu
>>> (objcopy/strip --strip-all). Is this an expected modification?
>>>
>>> If we reach the consensus that this section is useful, llvm-objcopy may
>>> be the right place to implement the update/verification features. If the
>>> performance is really critical (see my question mentioned before), we
>>> probably need to make llvm-objcopy's in-place update fast by not
>>> overwriting contents that are not changed.
>>>
>>> > *Is computing memory-mapped sections strong enough to detect
>>> post-link modifications?*
>>>
>>> In most cases, yes. A lot of people (including me) hold the opinion that
>>> non-SHF_ALLOC parts should not affect runtime execution. There are some
>>> counter-examples (runtime introspection), though. 1) The non-SHF_ALLOC
>>> .ARM.attributes (https://reviews.llvm.org/D69188) is used by Debian
>>> patched glibc ld.so. 2) The .ctf developers intend .ctf to be
>>> non-strippable https://sourceware.org/ml/binutils/2019-09/msg00209.html (see
>>> the thread in October; I even implemented objcopy --keep-section for them
>>> but I will likely lose the battle).
>>>
>>>
>>> On Tue, Nov 26, 2019 at 11:15 PM Jake Ehrlich via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> The ELF file header isn't always covered by a segment but still affects
>>>> loading. I think everything else that affects loading/dynamic linking is
>>>> always covered by a PT_LOAD segment. As evidence this is basically what
>>>> --strip-sections in llvm-strip and eu-strip do and they produce perfectly
>>>> runnable binaries.
>>>>
>>>> Having a hash of the actual memory map is interesting IMO. Build IDs
>>>> can't really be verified but a hash of the memory map would be loadable
>>>> with the expected semantics if and only if the hash was verifiable. So if
>>>> there's a use case for verification, then this seems sensible to me. I'm
>>>> not sure where such a verification matters however.
>>>>
>>>> On Tue, Nov 26, 2019, 10:04 PM Rui Ueyama via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> Thank you for starting the thread! To make clear what the proposed
>>>>> feature can and can't do, let me ask a few questions (allow me to
>>>>> repeat questions that were already discussed in the review thread):
>>>>>
>>>>> *Why is build-id not sufficient?*
>>>>>
>>>>> If you pass -build-id to the linker, the linker computes a hash of the
>>>>> entire output file and appends it to a .note section. This is not intended
>>>>> to be a checksum but more like just a unique identifier. But you might be
>>>>> able to use it as a checksum and detect any post-link modification by
>>>>> recomputing the build-id and comparing it with the content of the .note
>>>>> section.
>>>>>
>>>>> *What kind of post-link modification are you expecting?*
>>>>>
>>>>> The first thing that comes to mind is the strip command, which removes
>>>>> debug info and the symbol table. But it looks like you are expecting more
>>>>> than that?
>>>>>
>>>>> *Is computing memory-mapped sections strong enough to detect post-link
>>>>> modifications?*
>>>>>
>>>>> I wonder if there's some section or an ELF header field that is not
>>>>> mapped to memory at run-time but affects how the loader works. If such a
>>>>> thing exists, computing a hash of all memory-mapped sections is not enough
>>>>> to catch post-link modifications.
>>>>>
>>>>> On Thu, Nov 21, 2019 at 8:43 PM Chris Jackson via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> I'm implementing a watermarking feature for lld that computes a hash
>>>>>> of loadable segments and places the result in a note section. Ongoing
>>>>>> work can be found here:
>>>>>>
>>>>>> https://reviews.llvm.org/D70316
>>>>>> https://reviews.llvm.org/D66426
>>>>>>
>>>>>> The purpose of this watermark is to enable detection of post-link
>>>>>> modifications to the loadable segments of the binary. Such
>>>>>> modifications may produce a binary that relies on functionality that
>>>>>> is an incidental detail of the OS that may change in a future update
>>>>>> and negatively affect the runtime behaviour of the binary.
>>>>>>
>>>>>> As well as identifying reliance on unspecified behaviour, on
>>>>>> detection of post-link changes we can then look at improving our
>>>>>> tooling to support whatever changes had been applied.
>>>>>>
>>>>>> It's critical for us that the watermark has minimal impact on build
>>>>>> time; cryptographic security is not the goal. Hence, xxhash is used,
>>>>>> as our experiments showed it has minimal overhead.
>>>>>>
>>>>>> Chris
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>
>>>
>>> --
>>> 宋方睿
>>>
>
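
Jake's point earlier in the thread, that the ELF file header is not always
covered by a segment, can be checked mechanically. The sketch below is only an
illustration and not part of the proposed patches. It assumes a 64-bit
little-endian ELF file and reports whether some PT_LOAD segment starts at file
offset 0 and spans at least e_ehsize bytes. The function name
header_covered_by_load is hypothetical.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def header_covered_by_load(data: bytes) -> bool:
    """Return True if a PT_LOAD segment's file range contains the ELF header."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian files")

    # ELF64 header: e_phoff at 0x20; e_ehsize, e_phentsize, e_phnum at 0x34.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_ehsize, e_phentsize, e_phnum = struct.unpack_from("<HHH", data, 0x34)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type at 0x00, p_offset at 0x08, p_filesz at 0x20.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 0x20)
        if p_type == PT_LOAD and p_offset == 0 and p_filesz >= e_ehsize:
            return True
    return False
```

When this returns False, a hash restricted to PT_LOAD contents does not see
the ELF header, so a check that wants to catch header edits would need to
hash the header separately.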