[PATCH] D64306: [clangd] Use xxhash instead of SHA1 for background index file digests.

Sam McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 7 20:50:30 PDT 2019


sammccall created this revision.
sammccall added a reviewer: kadircet.
Herald added subscribers: llvm-commits, arphaman, jkorous, MaskRay, ilya-biryukov.
Herald added a project: LLVM.

Currently SHA1 accounts for about 10% of our CPU usage; this patch reduces that to ~1%.

xxhash is a well-defined (stable) non-cryptographic hash optimized for
fast checksums (like crc32).
Collisions shouldn't be a problem, despite the shorter digest:

- for actual file content (used to invalidate bg index shards), only two versions can collide: the one recorded in the old shard and the one that would go into the new shard.
- for file paths in bg index shard filenames, we would need about 2^32 files with the same filename before we'd expect a collision (the birthday bound for a 64-bit hash). Imperfect hashing may reduce this a bit, but it's still well beyond what's plausible.
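
To put rough numbers on the 2^32 figure, here is a small sketch of the standard birthday approximation, p ~= 1 - exp(-n^2 / 2^(b+1)), for n uniformly random b-bit hashes. It is not part of the patch; the function name is made up for illustration, and it assumes the hash behaves like an ideal 64-bit hash.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the patch): approximate probability of at
// least one collision among n uniformly random `bits`-bit hash values,
// using the birthday approximation p ~= 1 - exp(-n^2 / 2^(bits+1)).
double collisionProbability(double n, int bits) {
  assert(n >= 0 && bits > 0);
  double exponent = -(n * n) / std::ldexp(1.0, bits + 1); // -n^2 / 2^(bits+1)
  return -std::expm1(exponent); // 1 - exp(exponent), accurate for tiny values
}
```

For n = 2^32 and 64 bits the exponent is exactly -0.5, so the result is 1 - exp(-0.5), about 0.39. For a more realistic n, say a million paths sharing one filename, it is on the order of 1e-8.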

This will invalidate shards on disk (as usual; I bumped the version),
but this time the filenames are changing, so the old files will stick
around :-( That makes this more expensive than the usual bump, but it would be
good to land before the v9 branch, when everyone will start using bg index.
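
To illustrate why the old files stick around, here is a rough sketch of a shard filename that embeds a digest of the full path. Everything in it is an assumption made for illustration: the "<filename>.<hex digest>.idx" layout, the helper names, and the FNV-1a hash, which is only a stand-in so the example is self-contained (it is not xxhash and not what the patch uses).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in 64-bit hash (FNV-1a). NOT xxhash; used only to keep this
// sketch free of external dependencies.
uint64_t standInHash64(const std::string &s) {
  uint64_t h = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL; // FNV-1a 64-bit prime
  }
  return h;
}

// Formats a 64-bit value as 16 uppercase hex digits.
std::string hex64(uint64_t v) {
  char buf[17];
  std::snprintf(buf, sizeof(buf), "%016llX",
                static_cast<unsigned long long>(v));
  return buf;
}

// Returns the last path component (hypothetical helper, '/' separators only).
std::string fileNameOf(const std::string &path) {
  assert(!path.empty());
  std::string::size_type slash = path.find_last_of('/');
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical shard name: "<filename>.<hex digest of full path>.idx".
std::string shardName(const std::string &path, const std::string &digestHex) {
  return fileNameOf(path) + "." + digestHex + ".idx";
}
```

With a scheme like this, two source files with the same filename in different directories get different shard names, and switching the digest function changes every shard name, so new shards are written next to the old ones instead of overwriting them.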


Repository:
  rL LLVM

https://reviews.llvm.org/D64306

Files:
  clang-tools-extra/clangd/SourceCode.cpp
  clang-tools-extra/clangd/SourceCode.h
  clang-tools-extra/clangd/index/Background.cpp
  clang-tools-extra/clangd/index/Background.h
  clang-tools-extra/clangd/index/BackgroundIndexStorage.cpp
  clang-tools-extra/clangd/index/Serialization.cpp
  clang-tools-extra/clangd/unittests/SerializationTests.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D64306.208309.patch
Type: text/x-patch
Size: 4980 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190708/c3ecb154/attachment.bin>

