[PATCH] D17321: DIEData, DIEWriter: introduce and begin migration.

Fri Feb 19 18:39:10 PST 2016

On Fri, Feb 19, 2016 at 5:45 PM Peter Collingbourne via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> On Fri, Feb 19, 2016 at 03:47:22PM -0800, Adrian Prantl wrote:
> >
> > > On Feb 19, 2016, at 12:31 PM, Peter Collingbourne via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
> > >
> > > them incorrectly, see DwarfDebug::makeTypeSignature). Since the page at
> > > https://gcc.gnu.org/wiki/DebugFission does not specify how the hash
> is to be
> > > computed, could we maybe do something simpler (e.g. MD5 hash the DIE
> bytes
> > > themselves) and avoid needing to use something like the DWARF type
> hashing
> > > algorithm for this?
> >
> > Where it can LLVM uses an MD5 sum over the mangled name of the type to
> avoid the expensive DWARF type hashing.
> >
> > Just a side node: Debug info fission is in the process of being
> standardized in DWARF 5. The progress can be tracked here:
> >   http://dwarfstd.org/Issues.php?type=closed4 (search for “fission” and
> “split DWARF”)
> > The DWARF specification may deviate from the description on the GCC wiki
> a little, but we probably want to follow the DWARF spec where that makes
> sense.
>
> I see. From http://dwarfstd.org/ShowIssue.php?issue=130313.4 :
>
> >   5.  A DW_AT_dwo_id attribute whose value is an 8-byte
> >       unsigned hash of the full compilation unit.  This hash
> >       value is computed by the method described in Section 7.27
> >       ("Type Signature Computation").
>
> That's unfortunate. Maybe I'm missing something, but I don't really see why
> the attribute value needs to be specified like that, as the hash is only
> used by consumers to match skeleton units with full units, and it should be
> up to the producer to use a good enough hash. Is there any possibility of
> changing the specification before DWARF 5 is standardised?
>

It has been, but it's not really important as any two compilers aren't
going to agree on debug info output anyhow so they're unlikely to match. I
raised this in committee at the time as a "may" rather than "is". That
said, recent text (I don't have an absolutely updated version in front of
me) reads:

A DW_AT_dwo_id attribute whose implementation-defined integer constant
value, known as the compilation unit ID, provides unique identification of
this compilation unit as well as the associated split compilation unit in
the object file named in the DW_AT_dwo_name attribute. For simplicity, the
DW_AT_dwo_id attributes in the skeleton compilation unit and the
corresponding split full compilation unit (see Section 3.1.3 on the
following page) must use the same form to encode this identification value.

The means of determining a compilation unit ID does not need to be similar
or related to the means of determining a type unit signature.

That said, ISTR at the time having a good reason why a hash of the
debug_info section wasn't a sufficient method for unique identifiers, but
my brain isn't coming up with it right now. Might have been related to not
having a finalized section before wanting to splat in the hash, but I'm not
sure.

For the type ids we haven't worried because we're only supporting type
units for C++ types at the moment, if we wanted to support C types we'd
want to come up with another identifier so we could merge similar types
across compilation units.

Hope this helps.

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160220/9179d48a/attachment.html>