[PATCH] D17321: DIEData, DIEWriter: introduce and begin migration.
David Blaikie via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 22 14:34:53 PST 2016
On Mon, Feb 22, 2016 at 2:11 PM, Robinson, Paul <
Paul_Robinson at playstation.sony.com> wrote:
> Agreed - further, having compilation of the "same" source having different
> IDs is an anti-goal, it would mean non-reproducible/stable builds, which we
> certainly don't want.
>
>
>
> Hmm, that seems worthwhile.
>
>
> Ideally, the DWO ID wouldn't change if you change text in a comment, even,
> or the name of a macro (if you don't have macro debug info enabled - though
> currently it would not change even if you did have macro debug info
> enabled, which is unfortunate). That way, an optimal build system would
> rebuild the .o/.dwo, notice that they are identical to the previous ones,
> and not rebuild the linked executable or the dwp (if any).
>
>
>
> Well, seems like the DWO ID wouldn't *need* to change even if you made
> moderate code changes (as long as the types and variables and function
> signatures and set of template instantiations etc didn't change). As long
> as the .debug_addr *indexes* didn't change, for example, it doesn't matter
> if the *content* of the .debug_addr section (in the .o file) did change. A
> hash of the content of the *.dwo sections would DTRT here.
>
Right
I think at the time we felt we had a circular problem when it came to
hashing the entirety of the .dwo sections - no easy way in the MC layers to
go & patch up the DWO ID into the .dwo sections you just hashed, etc. But I
don't think that was the only problem - not sure, though.
- Dave
> --paulr
>
>
>
>
>
> *From:* David Blaikie [mailto:dblaikie at gmail.com]
> *Sent:* Sunday, February 21, 2016 4:42 PM
> *To:* Eric Christopher
> *Cc:* Robinson, Paul; Peter Collingbourne; Adrian Prantl;
> llvm-commits at lists.llvm.org
>
> *Subject:* Re: [PATCH] D17321: DIEData, DIEWriter: introduce and begin
> migration.
>
>
>
>
>
>
>
> On Fri, Feb 19, 2016 at 7:06 PM, Eric Christopher via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>
>
> On Fri, Feb 19, 2016 at 6:52 PM Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
>
> Type units are COMDATs produced across (potentially) many compilations, so
> the key has to correspond to the content in a predictable/consistent way in
> order for the linker to throw away duplicates. Therefore the type-unit hash
> is specified the way it is.
>
>
>
> Doesn't matter. The way that we're doing it in clang is perfectly
> acceptable for this.
>
>
>
> Ish - there are tradeoffs for sure. The way we're doing this in Clang
> means that if the user violates the ODR they get a weirder experience in
> the debugger than not. But we decided to make that tradeoff to
> simplify/optimize/etc the implementation.
>
>
>
>
>
> dwo_ids are just identifiers matching two lumps of data for the same CU,
> produced by the same compilation, so they aren't sensitive to content (at
> least not in the same way as type units). But it would seem prudent for
> two compilations of the "same" source to have different IDs, thus hashing
> the .debug_info content might not be the best choice; a more random-number
> kind of ID would be more appropriate.
>
>
>
> I don't see this either, but I know where you're coming from.
>
>
>
> Agreed - further, having compilation of the "same" source having different
> IDs is an anti-goal, it would mean non-reproducible/stable builds, which we
> certainly don't want.
>
> Ideally, the DWO ID wouldn't change if you change text in a comment, even,
> or the name of a macro (if you don't have macro debug info enabled - though
> currently it would not change even if you did have macro debug info
> enabled, which is unfortunate). That way, an optimal build system would
> rebuild the .o/.dwo, notice that they are identical to the previous ones,
> and not rebuild the linked executable or the dwp (if any).
>
>
>
>
>
> -eric
>
>
>
> --paulr
>
>
>
> *From:* llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] *On
> Behalf Of *Eric Christopher via llvm-commits
> *Sent:* Friday, February 19, 2016 6:39 PM
> *To:* Peter Collingbourne; Adrian Prantl
> *Cc:* llvm-commits at lists.llvm.org
> *Subject:* Re: [PATCH] D17321: DIEData, DIEWriter: introduce and begin
> migration.
>
>
>
> On Fri, Feb 19, 2016 at 5:45 PM Peter Collingbourne via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
> On Fri, Feb 19, 2016 at 03:47:22PM -0800, Adrian Prantl wrote:
> >
> > > On Feb 19, 2016, at 12:31 PM, Peter Collingbourne via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
> > >
> > > them incorrectly, see DwarfDebug::makeTypeSignature). Since the page at
> > > https://gcc.gnu.org/wiki/DebugFission does not specify how the hash
> is to be
> > > computed, could we maybe do something simpler (e.g. MD5 hash the DIE
> bytes
> > > themselves) and avoid needing to use something like the DWARF type
> hashing
> > > algorithm for this?
> >
> > Where it can LLVM uses an MD5 sum over the mangled name of the type to
> avoid the expensive DWARF type hashing.
> >
> > Just a side node: Debug info fission is in the process of being
> standardized in DWARF 5. The progress can be tracked here:
> > http://dwarfstd.org/Issues.php?type=closed4 (search for “fission” and
> “split DWARF”)
> > The DWARF specification may deviate from the description on the GCC wiki
> a little, but we probably want to follow the DWARF spec where that makes
> sense.
>
> I see. From http://dwarfstd.org/ShowIssue.php?issue=130313.4 :
>
> > 5. A DW_AT_dwo_id attribute whose value is an 8-byte
> > unsigned hash of the full compilation unit. This hash
> > value is computed by the method described in Section 7.27
> > ("Type Signature Computation").
>
> That's unfortunate. Maybe I'm missing something, but I don't really see why
> the attribute value needs to be specified like that, as the hash is only
> used by consumers to match skeleton units with full units, and it should be
> up to the producer to use a good enough hash. Is there any possibility of
> changing the specification before DWARF 5 is standardised?
>
>
>
> It has been, but it's not really important as any two compilers aren't
> going to agree on debug info output anyhow so they're unlikely to match. I
> raised this in committee at the time as a "may" rather than "is". That
> said, recent text (I don't have an absolutely updated version in front of
> me) reads:
>
>
>
> A DW_AT_dwo_id attribute whose implementation-defined integer constant
> value, known as the compilation unit ID, provides unique identification of
> this compilation unit as well as the associated split compilation unit in
> the object file named in the DW_AT_dwo_name attribute. For simplicity, the
> DW_AT_dwo_id attributes in the skeleton compilation unit and the
> corresponding split full compilation unit (see Section 3.1.3 on the
> following page) must use the same form to encode this identification value.
>
>
>
> The means of determining a compilation unit ID does not need to be similar
> or related to the means of determining a type unit signature.
>
>
>
> That said, ISTR at the time having a good reason why a hash of the
> debug_info section wasn't a sufficient method for unique identifiers, but
> my brain isn't coming up with it right now. Might have been related to not
> having a finalized section before wanting to splat in the hash, but I'm not
> sure.
>
>
>
> For the type ids we haven't worried because we're only supporting type
> units for C++ types at the moment, if we wanted to support C types we'd
> want to come up with another identifier so we could merge similar types
> across compilation units.
>
>
>
> Hope this helps.
>
>
>
> -eric
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160222/5cf9596c/attachment.html>
More information about the llvm-commits
mailing list