[PATCH] D17321: DIEData, DIEWriter: introduce and begin migration.

Eric Christopher via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 21 16:42:56 PST 2016


On Sun, Feb 21, 2016 at 4:41 PM David Blaikie <dblaikie at gmail.com> wrote:

> On Fri, Feb 19, 2016 at 7:06 PM, Eric Christopher via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>>
>>
>> On Fri, Feb 19, 2016 at 6:52 PM Robinson, Paul <
>> Paul_Robinson at playstation.sony.com> wrote:
>>
>>> Type units are COMDATs produced across (potentially) many compilations,
>>> so the key has to correspond to the content in a predictable/consistent way
>>> in order for the linker to throw away duplicates. Therefore the type-unit
>>> hash is specified the way it is.
>>>
>>
>> Doesn't matter. The way that we're doing it in clang is perfectly
>> acceptable for this.
>>
>
> Ish - there are tradeoffs for sure. The way we're doing this in Clang
> means that if the user violates the ODR they get a weirder experience in
> the debugger than not. But we decided to make that tradeoff to
> simplify/optimize/etc the implementation.
>
>
>>
>>
>>> dwo_ids are just identifiers matching two lumps of data for the same CU,
>>> produced by the same compilation, so they aren't sensitive to content (at
>>> least not in the same way as type units).  But it would seem prudent for
>>> two compilations of the "same" source to have different IDs, thus hashing
>>> the .debug_info content might not be the best choice; a more random-number
>>> kind of ID would be more appropriate.
>>>
>>
>> I don't see this either, but I know where you're coming from.
>>
>
> Agreed - further, having compilation of the "same" source having different
> IDs is an anti-goal, it would mean non-reproducible/stable builds, which we
> certainly don't want.
>
> Ideally, the DWO ID wouldn't change if you change text in a comment, even,
> or the name of a macro (if you don't have macro debug info enabled - though
> currently it would not change even if you did have macro debug info
> enabled, which is unfortunate). That way, an optimal build system would
> rebuild the .o/.dwo, notice that they are identical to the previous ones,
> and not rebuild the linked executable or the dwp (if any).
>
>

Yep


>
>> -eric
>>
>>
>>> --paulr
>>>
>>>
>>>
>>> *From:* llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] *On
>>> Behalf Of *Eric Christopher via llvm-commits
>>> *Sent:* Friday, February 19, 2016 6:39 PM
>>> *To:* Peter Collingbourne; Adrian Prantl
>>> *Cc:* llvm-commits at lists.llvm.org
>>> *Subject:* Re: [PATCH] D17321: DIEData, DIEWriter: introduce and begin
>>> migration.
>>>
>>>
>>>
>>> On Fri, Feb 19, 2016 at 5:45 PM Peter Collingbourne via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>> On Fri, Feb 19, 2016 at 03:47:22PM -0800, Adrian Prantl wrote:
>>> >
>>> > > On Feb 19, 2016, at 12:31 PM, Peter Collingbourne via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>> > >
>>> > > them incorrectly, see DwarfDebug::makeTypeSignature). Since the page
>>> at
>>> > > https://gcc.gnu.org/wiki/DebugFission does not specify how the hash
>>> is to be
>>> > > computed, could we maybe do something simpler (e.g. MD5 hash the DIE
>>> bytes
>>> > > themselves) and avoid needing to use something like the DWARF type
>>> hashing
>>> > > algorithm for this?
>>> >
>>> > Where it can LLVM uses an MD5 sum over the mangled name of the type to
>>> avoid the expensive DWARF type hashing.
>>> >
>>> > Just a side node: Debug info fission is in the process of being
>>> standardized in DWARF 5. The progress can be tracked here:
>>> >   http://dwarfstd.org/Issues.php?type=closed4 (search for “fission”
>>> and “split DWARF”)
>>> > The DWARF specification may deviate from the description on the GCC
>>> wiki a little, but we probably want to follow the DWARF spec where that
>>> makes sense.
>>>
>>> I see. From http://dwarfstd.org/ShowIssue.php?issue=130313.4 :
>>>
>>> >   5.  A DW_AT_dwo_id attribute whose value is an 8-byte
>>> >       unsigned hash of the full compilation unit.  This hash
>>> >       value is computed by the method described in Section 7.27
>>> >       ("Type Signature Computation").
>>>
>>> That's unfortunate. Maybe I'm missing something, but I don't really see
>>> why
>>> the attribute value needs to be specified like that, as the hash is only
>>> used by consumers to match skeleton units with full units, and it should
>>> be
>>> up to the producer to use a good enough hash. Is there any possibility of
>>> changing the specification before DWARF 5 is standardised?
>>>
>>>
>>>
>>> It has been, but it's not really important as any two compilers aren't
>>> going to agree on debug info output anyhow so they're unlikely to match. I
>>> raised this in committee at the time as a "may" rather than "is". That
>>> said, recent text (I don't have an absolutely updated version in front of
>>> me) reads:
>>>
>>>
>>>
>>> A DW_AT_dwo_id attribute whose implementation-defined integer constant
>>> value, known as the compilation unit ID, provides unique identification of
>>> this compilation unit as well as the associated split compilation unit in
>>> the object file named in the DW_AT_dwo_name attribute. For simplicity, the
>>> DW_AT_dwo_id attributes in the skeleton compilation unit and the
>>> corresponding split full compilation unit (see Section 3.1.3 on the
>>> following page) must use the same form to encode this identification value.
>>>
>>>
>>>
>>> The means of determining a compilation unit ID does not need to be
>>> similar or related to the means of determining a type unit signature.
>>>
>>>
>>>
>>> That said, ISTR at the time having a good reason why a hash of the
>>> debug_info section wasn't a sufficient method for unique identifiers, but
>>> my brain isn't coming up with it right now. Might have been related to not
>>> having a finalized section before wanting to splat in the hash, but I'm not
>>> sure.
>>>
>>>
>>>
>>> For the type ids we haven't worried because we're only supporting type
>>> units for C++ types at the moment, if we wanted to support C types we'd
>>> want to come up with another identifier so we could merge similar types
>>> across compilation units.
>>>
>>>
>>>
>>> Hope this helps.
>>>
>>>
>>>
>>> -eric
>>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160222/626343ab/attachment.html>


More information about the llvm-commits mailing list