[llvm-dev] [cfe-dev] RFC: ODR checker for Clang and LLD

Peter Collingbourne via llvm-dev llvm-dev at lists.llvm.org
Sat Jun 10 11:39:18 PDT 2017


On Sat, Jun 10, 2017 at 11:00 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:

>
>
> 2017-06-07 16:32 GMT-07:00 Peter Collingbourne via llvm-dev <
> llvm-dev at lists.llvm.org>:
>
>> On Wed, Jun 7, 2017 at 8:18 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>> Does this need LLVM support - or is there some generic representation
>>> that could be used instead? (I guess LLVM would want to be aware of it when
>>> merging modules though, so maybe it's worth having a first-class
>>> representation - though LLVM module linking could special case a section
>>> the same way the linker could/would - not sure what's the better choice
>>> there)
>>>
>>
>> The only thing that LLVM needs to do is to have some way to store a blob
>> that will be emitted to the object file. In my prototype I just create a
>> GlobalVariable with private linkage, I think in the final version I will
>> use an MDString referenced by a named MD node. I don't think we would want
>> a higher level representation -- I'd imagine that the blob would be
>> entirely a property of the source code, so I can't see anything that an
>> IR-level pass would want to do with it. It's similar to some parts of debug
>> info in that there's no real benefit to representing it as anything other
>> than a blob.
>>
>
> If you IR Link two modules, you'd want to check that the hash matches and
> diagnose immediately before dropping link once_odr functions, right?
>

The ODR table (including its string table) is created by the compiler and
is unaffected by dropped functions. Functions can be dropped by the
optimizer in the non-LTO case as well, of course, and we'd still want to be
able to diagnose ODR mismatches in those functions.

In any case, I think we'd want the ODR checker to always be driven by the
linker, not the IRMover. This is so that we can properly diagnose ODR
mismatches between bitcode files and regular object files, and avoid
needing to deal with duplicate diagnostics (say if we import the same pair
of ODR-mismatching modules into two different ThinLTO backends), and avoid
bad interactions with the ThinLTO cache (we don't want to miss diagnostics
because of a ThinLTO cache hit). The way I imagine that this would work is
that we'd add something to lto::InputFile that gives the client access to
the ODR table.

That said, we may want to reconsider the blob representation if we want to
try and share strings between the object file's string table and the ODR
table, as discussed elsewhere.

Peter


> --
> Mehdi
>
>
>
>>
>> I was thinking (hand-wavingly vague since I don't know that much about
>>> object files, etc) one of those auto-appending sections and an array of
>>> constchar*+hash attributed to that section. (then even without an
>>> odr-checking aware linker (which would compare and discard these sections)
>>> the data could be merged & a post-processing pass on the binary could still
>>> point out ODR violations without anything in the toolchain (except clang)
>>> needing to support this extra info)
>>>
>>
>> Linkers merge section contents by section name, so you wouldn't need
>> anything other than for the object files to agree on a section name. The
>> odrtab header in my prototype has a size field, so we could use that to
>> split an .odrtab section into multiple odrtabs.
>>
>> Peter
>>
>> On Tue, Jun 6, 2017 at 10:41 PM Peter Collingbourne via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'd like to propose an ODR checker feature for Clang and LLD. The
>>>> feature would be similar to gold's --detect-odr-violations feature, but
>>>> better: we can rely on integration with clang to avoid relying on debug
>>>> info and to perform more precise matching.
>>>>
>>>> The basic idea is that we use clang's ability to create ODR hashes for
>>>> declarations. ODR hashes are computed using all information about a
>>>> declaration that is ODR-relevant. If the flag -fdetect-odr-violations is
>>>> passed, Clang will store the ODR hashes in a so-called ODR table in each
>>>> object file. Each ODR table will contain a mapping from mangled declaration
>>>> names to ODR hashes. At link time, the linker will read the ODR table and
>>>> report any mismatches.
>>>>
>>>> To make this work:
>>>> - LLVM will be extended with the ability to represent ODR tables in the
>>>> IR and emit them to object files
>>>> - Clang will be extended with the ability to emit ODR tables using ODR
>>>> hashes
>>>> - LLD will be extended to read ODR tables from object files
>>>>
>>>> I have implemented a prototype of this feature. It is available here:
>>>> https://github.com/pcc/llvm-project/tree/odr-checker and some results
>>>> from applying it to chromium are here: crbug.com/726071
>>>> As you can see it did indeed find a number of real ODR violations in
>>>> Chromium, including some that wouldn't be detectable using debug info.
>>>>
>>>> If you're interested in what the format of the ODR table would look
>>>> like, that prototype shows pretty much what I had in mind, but I expect
>>>> many other aspects of the implementation to change as it is upstreamed.
>>>>
>>>> Thanks,
>>>> --
>>>> --
>>>> Peter
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>
>>
>>
>> --
>> --
>> Peter
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>


-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170610/84808653/attachment.html>


More information about the llvm-dev mailing list