[PATCH] D33488: [ELF] - Optimization for populating stringpool when building .gdb_index.

Wed May 24 16:53:34 PDT 2017

George Rimar via Phabricator <reviews at reviews.llvm.org> writes:

> grimar created this revision.
> Herald added a subscriber: emaste.
>
> It is possible to speed up population of the string pool just
> by reusing the precalculated hash value.
>
> Tested llc binary to link, 50 runs each:
> Without patch, with --gdb-index: 7,826269134 seconds time elapsed ( +-  0,06% )
> With patch, with --gdb-index: 7,729554651 seconds time elapsed ( +- 0,11% )
> A / B == 1,0125, or at least 1% for free.
>
>
> https://reviews.llvm.org/D33488
>
> Files:
>   ELF/SyntheticSections.cpp
>
>
> Index: ELF/SyntheticSections.cpp
> ===================================================================
> --- ELF/SyntheticSections.cpp
> +++ ELF/SyntheticSections.cpp
> @@ -1771,7 +1771,7 @@
>  
>    for (std::pair<StringRef, uint8_t> &Pair : NamesAndTypes) {
>      uint32_t Hash = hash(Pair.first);
> -    size_t Offset = StringPool.add(Pair.first);
> +    size_t Offset = StringPool.add({Pair.first, Hash});

The hash function being used is

static uint32_t hash(StringRef Str) {
  uint32_t R = 0;
  for (uint8_t C : Str)
    R = R * 67 + tolower(C) - 113;
  return R;
}

What happens if we have both the strings "Foo" and "foo"? Because of the
tolower() call they get the same hash value. Do we have a guarantee
that that never happens?
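
To make the concern concrete, here is a minimal standalone sketch of the
case I have in mind. It uses std::string_view in place of StringRef so it
builds without LLVM, and the names gdbHash and collides are made up for
illustration; only the arithmetic is taken from hash() above.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string_view>

// Same arithmetic as the hash() quoted above. std::string_view stands in
// for llvm::StringRef (an assumption made so this compiles on its own).
static uint32_t gdbHash(std::string_view Str) {
  uint32_t R = 0;
  for (unsigned char C : Str)
    R = R * 67 + std::tolower(C) - 113;
  return R;
}

// Hypothetical helper: true when two names receive the same hash value.
static bool collides(std::string_view A, std::string_view B) {
  return gdbHash(A) == gdbHash(B);
}

// Names that differ only in case always collide, since every character
// is lowercased before it is mixed into the hash.
static void checkCaseFolding() {
  assert(collides("Foo", "foo"));
  assert(collides("MAIN", "main"));
}
```

If StringPool only uses the supplied hash to pick a bucket and still
compares the string contents, this is just an extra collision. If it
treats an equal hash as meaning an equal string, "Foo" and "foo" would
end up sharing an offset. I have not checked which of the two it does.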

Cheers,
Rafael