[PATCH] llvm-cov: Updated file checksum to be timestamp.

Eric Christopher echristo at gmail.com
Sun Nov 17 06:43:18 PST 2013


On Nov 16, 2013 8:37 PM, "David Blaikie" <dblaikie at gmail.com> wrote:
>
>
>
>
> On Fri, Nov 15, 2013 at 11:03 PM, Eric Christopher <echristo at gmail.com> wrote:
>>
>> On Fri, Nov 15, 2013 at 6:08 PM, Nick Lewycky <nlewycky at google.com> wrote:
>> > On 15 November 2013 17:38, Robinson, Paul
>> > <Paul_Robinson at playstation.sony.com> wrote:
>> >>
>> >> > -----Original Message-----
>> >> > From: llvm-commits-bounces at cs.uiuc.edu
>> >> > [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Yuchen Wu
>> >> > Sent: Friday, November 15, 2013 4:59 PM
>> >> > To: Nick Lewycky; Bob Wilson
>> >> > Cc: llvm-commits at cs.uiuc.edu
>> >> > Subject: RE: [PATCH] llvm-cov: Updated file checksum to be timestamp.
>> >> >
>> >> > >> 2. Using the output file itself to seed hash function, which makes
>> >> > >> it deterministic. I've tried implementing this using the size of the
>> >> > >> output buffer and it was pretty simple. The problem with it, however,
>> >> > >> is that there's a lot more chance for a change to the GCNO file to go
>> >> > >> unnoticed. I also think that even if the source hadn't changed between
>> >> > >> compiles, the new binary files shouldn't be compatible with the old.
>> >> > >
>> >> > > This is obviously the correct approach. In general, it's important to
>> >> > > be able to have reproducible builds so that we can reproduce the same
>> >> > > binaries from source, builds where outputs can be cached (for instance
>> >> > > by modern non-make build systems that use the md5 of the output files),
>> >> > > etc. GCC's behaviour is silly and there's no need to replicate it.
>> >> > >
>> >> > >> "The problem with it, however, is that there's a lot more chance for
>> >> > >> a change to the GCNO file to go unnoticed."
>> >> > >
>> >> > > What do you mean by this? Are you worried that things could go into the
>> >> > > GCNO file without being an input to the hash function? The checksum is
>> >> > > a safety measure to help people avoid accidentally putting mismatching
>> >> > > GCNO and GCDA files together. Not having something be input to the hash
>> >> > > is the safe failure. We don't want the checksum to change if other
>> >> > > parts of the GCNO file weren't modified.
>> >> >
>> >> > What I meant by the last statement was that if you are doing something
>> >> > like hashing the size of the file to compute a checksum, there is a much
>> >> > higher chance that you may be using a GCNO file generated from a
>> >> > different source that just happens to be the same size. Obviously that
>> >> > was just an example, so if you guys came across a better way to seed the
>> >> > hash for Google's gcc checksum, I'd be happy to hear it :)
>> >>
>> >> Can we use an MD5 of the source file here? (Not having looked at the
>> >> patch, sorry...) The only reason I ask is that there's a DWARF 5 feature
>> >> to use MD5 instead of timestamps in the debug-line info, so computing an
>> >> MD5 of the source files is something we'll want to do anyway, eventually.
>> >
>> >
>> > That's reasonable, but I'd prefer to have the .gcno's checksum not depend
>> > on things which aren't in the .gcno. Fixing a typo in a comment for
>> > instance produces the same .o file and I'd like it to produce the same
>> > .gcno file.
>> >
>> > That applies to DWARF too. I hope we haven't standardized something that
>> > requires us to emit a different .o file just because of a typo fix in a
>> > comment.
>> >
>>
>> It's not finalized yet :)
>
>
> But the current spec (defining package hashing in terms of type hashing)
> doesn't include line numbers and a bunch of other stuff in the hash - so
> no, fixing a typo in a comment won't change the package hash.
>

Yeah, this is a DWARF 5 proposal from Paul that he was bringing up.

http://www.dwarfstd.org/ShowIssue.php?issue=130701.1

Which looks like it is going to have this issue. Paul?

--eric

> (I'm not sure it's right not to include line info, though - that seems
> sort of important to regenerate the debug info)
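
For illustration only, here is a rough Python sketch of the trade-off
discussed in the quoted thread: a checksum seeded from the size of the output
buffer is deterministic but collides whenever two different GCNO payloads
happen to be the same length, while a checksum over the output contents is
equally deterministic and still notices the change. This is not the patch
itself. The function names and sample byte strings are hypothetical, and
CRC32 is just a stand-in for whatever hash an implementation would use.

```python
import zlib


def size_checksum(gcno_bytes: bytes) -> int:
    # Hypothetical: seed the checksum from the buffer size only.
    return len(gcno_bytes) & 0xFFFFFFFF


def content_checksum(gcno_bytes: bytes) -> int:
    # Hypothetical: hash the buffer contents (CRC32 as a stand-in hash).
    return zlib.crc32(gcno_bytes) & 0xFFFFFFFF


# Two made-up payloads of identical length that differ in one byte.
old_gcno = b"fn=foo blocks=3 arcs=4"
new_gcno = b"fn=foo blocks=3 arcs=5"

# Same size, so the size-based checksum cannot tell them apart.
print(size_checksum(old_gcno) == size_checksum(new_gcno))

# The content-based checksum differs, and is stable across rebuilds
# that produce byte-identical output.
print(content_checksum(old_gcno) != content_checksum(new_gcno))
print(content_checksum(old_gcno) == content_checksum(bytes(old_gcno)))
```

Neither variant depends on a timestamp, so rebuilding unchanged input gives
the same checksum each time.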