[cfe-dev] RFC: ODR checker for Clang and LLD

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Wed Jun 7 10:00:02 PDT 2017


On Wed, Jun 7, 2017 at 9:37 AM Rui Ueyama <ruiu at google.com> wrote:

> As to the performance on the linker side, one thing I'd like to note is
> that we might be able to use some probabilistic approach (e.g. a bloom
> filter-ish data structure). In that sense, .odrtab doesn't have to contain
> complete information to detect ODR violations. Instead, it can contain
> hints. If the table allows us to quickly verify that there's no ODR violation
> with 99.99% probability for each identifier, for example, then we can fall
> back to the debug info to see if the remaining 0.01% are real, and the cost
> of false positives is probably negligible.
>

The main (only?) place where there are false positives is when there are
mixed binaries containing both Clang and GCC object files, which make
different choices about exactly where the starting line of a function is,
for example.

So perhaps there wouldn't be false positives if the only things
considered were Clang-built binaries (those containing this special odrtab
section). Though the gold ODR checker may've been more restrictive in what
it caught - with the ODR hashes Clang's creating, this can be far more
aggressive ODR checking - and if it always fell back to debug info, there
might be many false negatives? (These two hashes are different for the same
mangled name, but the debug info doesn't encode that difference - so the
diagnostic would not be emitted.)
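
To make the hash-only comparison concrete, here is a minimal sketch of what
a link-time check over the ODR tables could look like. This is not the
prototype's code: the types OdrEntry, ObjectFile and OdrMismatch, the
function FindOdrMismatches, and the 32-bit hash width are all assumptions
made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one ODR table entry:
// mangled declaration name -> ODR hash.
struct OdrEntry {
  std::string mangled_name;
  std::uint32_t odr_hash;  // width is an assumption
};

// Hypothetical view of one input object file and its ODR table.
struct ObjectFile {
  std::string path;
  std::vector<OdrEntry> odr_table;
};

// One reported disagreement between two object files.
struct OdrMismatch {
  std::string mangled_name;
  std::string first_file;
  std::string second_file;
};

// Records the first hash seen for each mangled name and reports every later
// object file whose hash for the same name differs from that first one.
std::vector<OdrMismatch> FindOdrMismatches(
    const std::vector<ObjectFile>& objects) {
  struct Seen {
    std::uint32_t hash;
    std::string file;
  };
  std::unordered_map<std::string, Seen> seen;
  std::vector<OdrMismatch> mismatches;
  for (const ObjectFile& obj : objects) {
    for (const OdrEntry& e : obj.odr_table) {
      auto result = seen.emplace(e.mangled_name, Seen{e.odr_hash, obj.path});
      const bool inserted = result.second;
      const Seen& first = result.first->second;
      if (!inserted && first.hash != e.odr_hash)
        mismatches.push_back({e.mangled_name, first.file, obj.path});
    }
  }
  return mismatches;
}
```

Nothing here consults debug info: a mismatch is reported purely because two
hashes for the same mangled name differ, which is the case debug info alone
might not be able to confirm.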

Falling back to debug info also means a lot more code in LLD to support
doing that analysis on the debug info.


> I can imagine for example that we can store a 32-bit hash for each mangled
> name and compare hashes instead of large strings.
>

Many of the mangled names (those of functions) may already be present in
strtab? Would it be valid/reasonable for this section to refer to those
strings to avoid some duplication?
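
Roughly what I have in mind, as a sketch only: each table entry holds an
offset into an existing string table plus the hash, rather than its own copy
of the name. The record layout, the names OdrTabEntry and EntryName, and the
field widths are hypothetical and not taken from the prototype.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size record: 8 bytes per declaration, no inline string.
struct OdrTabEntry {
  std::uint32_t name_offset;  // byte offset of the NUL-terminated mangled
                              // name inside the string table
  std::uint32_t odr_hash;     // ODR hash of the declaration (width assumed)
};

// Resolves an entry's name inside the string table. Returns nullptr if the
// offset is out of range or the name is not NUL-terminated within the table.
const char* EntryName(const std::vector<char>& strtab,
                      const OdrTabEntry& entry) {
  if (entry.name_offset >= strtab.size()) return nullptr;
  const char* start = strtab.data() + entry.name_offset;
  const std::size_t remaining = strtab.size() - entry.name_offset;
  if (std::memchr(start, '\0', remaining) == nullptr) return nullptr;
  return start;
}
```

Names with no symbol of their own (e.g. class types) would presumably still
need to have their strings stored somewhere, so this would only remove the
duplication for names that are already in the string table.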


>
> On Wed, Jun 7, 2017 at 8:18 AM, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Does this need LLVM support - or is there some generic representation
>> that could be used instead? (I guess LLVM would want to be aware of it when
>> merging modules though, so maybe it's worth having a first-class
>> representation - though LLVM module linking could special case a section
>> the same way the linker could/would - not sure what's the better choice
>> there)
>>
>> I was thinking (hand-wavingly vague since I don't know that much about
>> object files, etc) one of those auto-appending sections and an array of
>> const char* + hash attributed to that section. (then even without an
>> odr-checking aware linker (which would compare and discard these sections)
>> the data could be merged & a post-processing pass on the binary could still
>> point out ODR violations without anything in the toolchain (except clang)
>> needing to support this extra info)
>>
>> On Tue, Jun 6, 2017 at 10:41 PM Peter Collingbourne via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> Hi all,
>>>
>>> I'd like to propose an ODR checker feature for Clang and LLD. The
>>> feature would be similar to gold's --detect-odr-violations feature, but
>>> better: we can rely on integration with clang to avoid relying on debug
>>> info and to perform more precise matching.
>>>
>>> The basic idea is that we use clang's ability to create ODR hashes for
>>> declarations. ODR hashes are computed using all information about a
>>> declaration that is ODR-relevant. If the flag -fdetect-odr-violations is
>>> passed, Clang will store the ODR hashes in a so-called ODR table in each
>>> object file. Each ODR table will contain a mapping from mangled declaration
>>> names to ODR hashes. At link time, the linker will read the ODR table and
>>> report any mismatches.
>>>
>>> To make this work:
>>> - LLVM will be extended with the ability to represent ODR tables in the
>>> IR and emit them to object files
>>> - Clang will be extended with the ability to emit ODR tables using ODR
>>> hashes
>>> - LLD will be extended to read ODR tables from object files
>>>
>>> I have implemented a prototype of this feature. It is available here:
>>> https://github.com/pcc/llvm-project/tree/odr-checker and some results
>>> from applying it to chromium are here: crbug.com/726071
>>> As you can see it did indeed find a number of real ODR violations in
>>> Chromium, including some that wouldn't be detectable using debug info.
>>>
>>> If you're interested in what the format of the ODR table would look
>>> like, that prototype shows pretty much what I had in mind, but I expect
>>> many other aspects of the implementation to change as it is upstreamed.
>>>
>>> Thanks,
>>> --
>>> --
>>> Peter
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>
>