[PATCH] D56229: [PECOFF] Implementation of ObjectFilePECOFF:: GetUUID()

Pavel Labath via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 10 00:41:18 PDT 2019


labath added a comment.
Herald added a project: LLDB.

In D56229#1346941 <https://reviews.llvm.org/D56229#1346941>, @zturner wrote:

> Well, I guess I would ask what you want to do with the GUID?  If you want to match it to a debug information file, then the Debug Directory is the correct way to do that, and using a hash of the file path will not even be helpful.
>
> Another option would be to check for a debug directory of type `IMAGE_DEBUG_TYPE_REPRO`, and if that exists, then it means that the COFF timestamp is a hash of the binary, so it should be stable.
>
> If neither of these is present, then I think we should simply return `false` from this function and not mislead the caller.  The caller might wish to use special logic if the function returns false that says "if I couldn't get a UUID from the file, then try hashing the path and doing some kind of lookup based on that", but I don't think that should be part of this function.


I agree with the above statement. A very long time ago (for reasons which are not very important now), we decided to return a crc checksum as the UUID for ELF files without a build-id, and that's a decision that's been haunting us ever since. I think it would be good to not repeat the same mistake for COFF files. The problems with the crc elf checksum are:

- it does not help (in fact, it gets in the way) of matching build-id-less elf files, because both files appear to have a valid-but-different UUID
- checksumming whole files is slow, and we regularly get bug reports about lldb being slow from people whose default config does not include build-ids

So I'd propose to have this function return just the debug directory GUID, if it is there. Possibly the coff timestamp could be used as a fallback, but I don't know enough about windows/coff to say for sure. The case when we need to verify whether we have a local copy of a module for the remote debugging scenario can be handled at a different level. E.g., it already looks like the `qModuleInfo` packet we use does differentiate to some level the "uuid" and "md5" case https://github.com/llvm-mirror/lldb/blob/master/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp#L3539. Right now they are both used to fill in the UUID field of the module spec, but we could change that and treat each field slightly differently. For matching exes, pdbs and minidumps we would just use the UUID field, while for checking object identity we would also fall back to the md5 checksum if the UUID field is not present.


Repository:
  rLLDB LLDB

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56229/new/

https://reviews.llvm.org/D56229





More information about the llvm-commits mailing list