[PATCH] D40697: Cache modulo values for the .gnu.hash section.

Fri Dec 1 14:54:19 PST 2017

Rui Ueyama <ruiu at google.com> writes:

> On Fri, Dec 1, 2017 at 2:10 PM, Rafael Avila de Espindola <
> rafael.espindola at gmail.com> wrote:
>
>> Rui Ueyama <ruiu at google.com> writes:
>>
>> > On Fri, Dec 1, 2017 at 1:41 PM, Rafael Avila de Espindola <
>> > rafael.espindola at gmail.com> wrote:
>> >
>> >> Rui Ueyama via Phabricator <reviews at reviews.llvm.org> writes:
>> >>
>> >> > ruiu updated this revision to Diff 125067.
>> >> > ruiu added a comment.
>> >> >
>> >> > - clang-format
>> >> >
>> >> > It seems this patch actually makes that test case 2% faster, which
>> >> > isn't a negligible improvement.
>> >>
>> >> Please benchmark more than clang. There is a set of different benchmarks
>> >> in https://s3-us-west-2.amazonaws.com/linker-tests/
>> lld-speed-test.tar.xz.
>> >
>> >
>> > We don't need to run all these benchmarks as this patch is a micro
>> > optimization. More precisely, in order to create a hash table for 22212
>> > symbols,
>> >
>> >  - GnuHashTableSection::addSymbols took 60 ms
>> >  - GnuHashTableSection::writeHashTable took 8 ms.
>> >
>> > With this patch it now takes 37 ms and 7 ms, respectively.
>>
>> I am uncomfortable LGTMing an optimization without a full
>> benchmark. Specially one that trades memory for speed.
>>
>
> There's no memory-to-CPU tradeoff. There was a 32-bit padding at end of
> `Entry` struct.

OK, that makes it more likely that this is useful, but I still think
we should fully benchmark every single optimization.

Cheers,
Rafael