[llvm-bugs] [Bug 35914] New: lld needs to set the link timeStamp on Windows builds, probably to a hash of the binary
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Jan 11 11:36:56 PST 2018
https://bugs.llvm.org/show_bug.cgi?id=35914
Bug ID: 35914
Summary: lld needs to set the link timeStamp on Windows builds,
probably to a hash of the binary
Product: lld
Version: unspecified
Hardware: PC
OS: Windows NT
Status: NEW
Severity: release blocker
Priority: P
Component: COFF
Assignee: unassignedbugs at nondot.org
Reporter: brucedawson at chromium.org
CC: llvm-bugs at lists.llvm.org
lld currently sets the link-time-stamp to zero, rather than to the link time,
in order to support reproducible builds (builds where the results depend only
on the inputs, not on the time or machine used to do the build).
This is a worthy goal but this solution to the reproducible build problem is
*not* practical. It will completely break symbol servers.
Symbol servers are a vital tool for Windows developers and they are used not
just to archive PDB files but also to archive PE files (.exe and .dll). The
format of the paths used is documented here:
https://randomascii.wordpress.com/2013/03/09/symbols-the-microsoft-way/
In particular note that the path for a PE file on a symbol server is generated
like this:
"%s\%s\%s%s\%s" % (serverName, peName, timeStamp, imageSize, peName)
The peName and serverName can be considered to be unchanging which leaves the
timeStamp and imageSize to identify a particular binary. The imageSize is often
rounded to a page size so there are likely to be many similar builds which have
exactly the same imageSize.
So that leaves timeStamp as often the *only* differentiator between builds.
Therefore, setting the timeStamp field to a constant (zero or anything else) is
simply not tenable.
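As a rough illustration of the path format above, the sketch below reads the two identifying fields out of a PE image and builds the symbol server path. The function names are hypothetical. The field offsets follow the PE/COFF header layout (TimeDateStamp 8 bytes after the "PE\0\0" signature, SizeOfImage 56 bytes into the optional header). The "%08X%x" formatting of the key is an assumption based on how symbol stores are commonly laid out, not something stated in this report.

```python
import struct


def symbol_server_key(pe_bytes: bytes) -> str:
    """Return the timeStamp+imageSize directory name for a PE image."""
    # e_lfanew at offset 0x3C of the DOS header points at the PE signature.
    e_lfanew = struct.unpack_from("<I", pe_bytes, 0x3C)[0]
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # COFF header: Machine (2), NumberOfSections (2), TimeDateStamp (4).
    timestamp = struct.unpack_from("<I", pe_bytes, e_lfanew + 8)[0]
    # The optional header starts 24 bytes after the signature; SizeOfImage
    # sits at offset 56 within it for both PE32 and PE32+.
    size_of_image = struct.unpack_from("<I", pe_bytes, e_lfanew + 24 + 56)[0]
    # Assumed formatting: 8 uppercase hex digits, then unpadded lowercase hex.
    return "%08X%x" % (timestamp, size_of_image)


def symbol_server_path(server_name: str, pe_name: str, pe_bytes: bytes) -> str:
    """Build serverName\\peName\\<key>\\peName."""
    key = symbol_server_key(pe_bytes)
    return "\\".join([server_name, pe_name, key, pe_name])
```

With a timeStamp of zero the key collapses to "00000000" plus the image size, so every build of the same rounded size lands on the same path.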
Starting with Windows 10, Microsoft has been creating reproducible builds, which
is why the timeStamp field in Windows 10 binaries shows dates from seemingly
random years. The new implementation of this field is discussed here:
https://blogs.msdn.microsoft.com/oldnewthing/20180103-00/?p=97705
Roughly speaking the timeStamp field now contains some sort of hash of the
binary. This needs to be a reasonably good hash in order to reduce the odds of
collisions. A cryptographically secure hash would be overkill, especially given
that the result has to be packed down into a 32-bit value.
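A minimal sketch of the idea, assuming the hash is taken over the whole image with the timeStamp field zeroed first so that the result depends only on the other bytes. This is not how lld or Microsoft's linker actually computes the value. The function names are hypothetical, and SHA-1 is used only because it ships with Python; any well-distributed hash truncated to 32 bits would serve.

```python
import hashlib
import struct


def timestamp_field_offset(image) -> int:
    """Locate the COFF TimeDateStamp field inside a PE image."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    return e_lfanew + 8


def stamp_with_content_hash(image: bytes) -> bytes:
    """Return a copy of the image whose timeStamp is a 32-bit content hash."""
    buf = bytearray(image)
    off = timestamp_field_offset(buf)
    # Zero the field so the hash does not depend on any previous stamp.
    struct.pack_into("<I", buf, off, 0)
    digest = hashlib.sha1(bytes(buf)).digest()
    # Pack the hash down to 32 bits by keeping the first four digest bytes.
    stamp = struct.unpack_from("<I", digest, 0)[0]
    struct.pack_into("<I", buf, off, stamp)
    return bytes(buf)
```

Because the field is zeroed before hashing, stamping the same inputs always yields the same bytes, and re-stamping an already stamped image is a no-op.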
It should be possible to use the same idea to generate the GUID/age field for
identifying the PDB, in order to avoid this source of non-determinism in
builds.
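The same approach could be sketched for the PDB identity. The example below is purely illustrative: the function name is hypothetical, the choice of SHA-256 truncated to 16 bytes is an assumption, and fixing the age at 1 is an assumption rather than anything specified in this report.

```python
import hashlib
import uuid


def pdb_identity_from_content(content: bytes):
    """Derive a deterministic (GUID, age) pair from build output bytes."""
    # A GUID is 16 bytes, so keep the first 16 bytes of the digest.
    digest = hashlib.sha256(content).digest()[:16]
    guid = uuid.UUID(bytes=digest)
    # Assumed: a freshly linked PDB starts at age 1.
    age = 1
    return guid, age
```

Identical inputs then produce an identical GUID/age pair, which removes this source of build-to-build variation.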
Using a hash for the timeStamp and GUID/age would be compatible with the
buildID concept on Linux.
Any versions of lld that ignore this are incompatible with symbol servers and
therefore incompatible with "real" Windows development.