[all-commits] [llvm/llvm-project] 48e93f: [Support] Add llvm::xxh3_64bits

Tue Jul 18 13:36:26 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 48e93f57f1ee914ca29aa31bf2ccd916565a3610
      https://github.com/llvm/llvm-project/commit/48e93f57f1ee914ca29aa31bf2ccd916565a3610
  Author: Fangrui Song <i at maskray.me>
  Date:   2023-07-18 (Tue, 18 Jul 2023)

  Changed paths:
    M llvm/include/llvm/Support/xxhash.h
    M llvm/lib/Support/xxhash.cpp
    M llvm/unittests/Support/xxhashTest.cpp

  Log Message:
  -----------
  [Support] Add llvm::xxh3_64bits

ld.lld SHF_MERGE|SHF_STRINGS duplicate elimination is computation heavy
and utilitizes llvm::xxHash64, a simplified version of XXH64.
Externally many sources confirm that a new variant XXH3 is much faster.

I have picked a few hash implementations and computed the
proportion of time spent on hashing in the overall link time (a debug
build of clang 16 on a machine using AMD Zen 2 architecture):

* llvm::xxHash64: 3.63%
* official XXH64 (`#define XXH_VECTOR XXH_SCALAR`): 3.53%
* official XXH3_64bits (`#define XXH_VECTOR XXH_SCALAR`): 1.21%
* official XXH3_64bits (default, essentially `XXH_SSE2`): 1.22%
* this patch llvm::xxh3_64bits: 1.19%

The remaining part of lld remains unchanged. Consequently, a lower ratio
indicates that hashing is faster. Therefore, it is evident that XXH3 from xxhash
is significantly faster than both the official version and our llvm::xxHash64.

(
string length: count
1-3: 393434
4-8: 2084056
9-16: 2846249
17-128: 5598928
129-240: 1317989
241-: 328058
)

This patch adds heavily simplified https://github.com/Cyan4973/xxHash,
taking account of many simplification ideas from Devin Hussey's xxhash-clean.

Important x86-64 optimization ideas:

* Make XXH3_len_129to240_64b and XXH3_hashLong_64b noinline
* Unroll XXH3_len_17to128_64b
* __restrict does not affect Clang code generation

Beside SHF_MERGE|SHF_STRINGS duplicate elimination, llvm/ADT/StringMap.h
StringMapImpl::LookupBucketFor and a few places in lld can potentially be
accelerated by switching to llvm::xxh3_64bits.

Link: https://github.com/llvm/llvm-project/issues/63750

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D154812