[PATCH] D41993: [ELF] - Change shift2 constant of GNU_HASH from 6->11.

Wed Jan 17 15:24:00 PST 2018

Please take a look at https://reviews.llvm.org/D42204. This patch includes
what is, IMO, a better way of measuring the performance of symbol name
resolution. Running `ninja check-llvm` is too noisy because it runs a lot
of tests that are unrelated to dynamic symbol resolution.

On Wed, Jan 17, 2018 at 1:50 PM, Rui Ueyama <ruiu at google.com> wrote:

> On Wed, Jan 17, 2018 at 7:07 AM, George Rimar <grimar at accesssoftek.com>
> wrote:
>
>> >That said, I don't think this patch makes much sense. There's no theory
>> >behind it to explain why this could make things faster (it shouldn't as
>> >long as the hash function generates evenly distributed hash values). I
>> >think what we should do is
>>
>>
>> I'll explain my theory at the end.
>>
>>
>> >
>> >1. first, verify that our bloom filter is really correct by taking a
>> >look at the output binary carefully, because even if the bloom filter is
>> >wrong, that doesn't cause any correctness issue, but instead it just makes
>> >things slower, and then
>>
>> That is not entirely correct. If you put a few excess bits into the bloom
>> filter, nothing too bad happens: you only increase the number of false
>> positives, and in that case you're right. But if you forget to set a
>> required bit, things will most likely just stop working.
>> This is easy to check: I changed the position of one of the bits in our
>> filter to break it, and immediately got a "symbol lookup error" from the
>> loader.
>>
>> So I think our filter is correct. Besides the above, I also debugged gold
>> and tweaked LLD's settings to match gold's for the sample I used, then
>> compared what the two linkers do. (The sample test.s and the LLD patch are
>> attached. The sample is based on one of our tests; I had to add a few
>> symbols that gold adds to .dynsym, and for gold I had to pass
>> --noinhibit-exec to produce the output.)
>>
>> Gold fills the bloom filter in a slightly different order (we use a list
>> of symbols that is already sorted by bucket id; gold does not, which
>> changes the fill order, though the symbols in the final output are still
>> sorted correctly). Even so, I ended up with two identical bloom filters.
>>
>> What did differ in the output is most of the hash table part of .gnu_hash.
>> I have not had a chance to investigate it deeply yet. I guess it is related
>> to symbol order too, but it is probably worth checking to be sure we are
>> not broken here. I plan to do that.
>>
>> >2. understand what's the difference between our output and GNU linker's
>> >output.
>> >
>> >I believe just changing parameters and benchmarking them is not a good
>> >strategy. I want to understand it first before getting some optimized
>> >parameters based on experiments.
>>
>> I think the following is worth mentioning. When we fill the bloom filter
>> words, we calculate the index of the word as
>> size_t I = (Sym.Hash / C) & (MaskWords - 1);
>> (https://github.com/llvm-mirror/lld/blob/master/ELF/SyntheticSections.cpp#L1746)
>>
>> It looks correct (and FWIW gold uses the same logic), but the noticeable
>> thing here is that even when all symbols have different hashes, the left
>> part, "Sym.Hash / C", collapses many of them onto the same value, as in the
>> list below:
>>
>> Name: __bss_start, Hash: 475558360, Hash/C: 7430599
>> Name: _end, Hash: 2090001339, Hash/C: 32656270
>> Name: sym1, Hash: 2090741775, Hash/C: 32667840
>> Name: sym2, Hash: 2090741776, Hash/C: 32667840
>> Name: _edata, Hash: 3973399875, Hash/C: 62084373
>> Name: sym3, Hash: 2090741777, Hash/C: 32667840
>> Name: sym4, Hash: 2090741778, Hash/C: 32667840
>> Name: sym5, Hash: 2090741779, Hash/C: 32667840
>> Name: sym10, Hash: 275001887, Hash/C: 4296904
>> Name: sym6, Hash: 2090741780, Hash/C: 32667840
>> Name: sym11, Hash: 275001888, Hash/C: 4296904
>> Name: sym7, Hash: 2090741781, Hash/C: 32667840
>> Name: sym12, Hash: 275001889, Hash/C: 4296904
>> Name: sym8, Hash: 2090741782, Hash/C: 32667840
>> Name: sym13, Hash: 275001890, Hash/C: 4296904
>> Name: sym9, Hash: 2090741783, Hash/C: 32667840
>> Name: sym14, Hash: 275001891, Hash/C: 4296904
>> Name: sym15, Hash: 275001892, Hash/C: 4296904
>> Name: sym16, Hash: 275001893, Hash/C: 4296904
>>
>> only __bss_start and _edata have I == 1; all other symbols have I == 0. I
>> think that shows that how the bloom filter gets filled can depend heavily
>> on the symbol names.
>>
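A minimal sketch of the index calculation above, for anyone who wants to
reproduce the listing. It assumes the standard GNU hash function
(h = h * 33 + c, seeded with 5381), C = 64 bits per word, and MaskWords = 2;
the last two are inferred from the Hash/C and I values quoted above rather
than stated in the thread. The names gnuHash and bloomWordIndex are made up
for this example and are not LLD functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard GNU hash: start at 5381, then h = h * 33 + c for every byte.
// Arithmetic wraps modulo 2^32.
uint32_t gnuHash(const char *Name) {
  uint32_t H = 5381;
  for (; *Name; ++Name)
    H = H * 33 + static_cast<unsigned char>(*Name);
  return H;
}

// Index of the bloom filter word that a hash selects.
// C is the number of bits per word (64 is assumed for a 64-bit target);
// MaskWords is the number of words and must be a power of two.
size_t bloomWordIndex(uint32_t Hash, size_t C, size_t MaskWords) {
  assert(MaskWords != 0 && (MaskWords & (MaskWords - 1)) == 0 &&
         "MaskWords must be a power of two");
  return (Hash / C) & (MaskWords - 1);
}
```

Under these assumptions, bloomWordIndex(gnuHash("__bss_start"), 64, 2) and
the same call for "_edata" give 1, while "sym1" and "sym10" give 0, which
matches the I values described above.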
>> My theory was to tweak Shift2 so that it sets a new second bit as often
>> as possible.
>> It is based on the following idea. Imagine we have some symbols
>> "A", "B", "C".
>>
>> 1) Let's assume we placed A into the bloom filter and it became 11000000.
>> 2) Let's try to look up B. The lookup should fail. If B's first bit is
>> 10000000, then the ideal situation for us is that B's second bit differs
>> from 01000000, because we do not want a false positive during lookup.
>> 3) Now we want to place B. Based on (2), we want to set up Shift2 so that
>> it places the second bit in a free position, so the result can be
>> 11100000.
>> 4) Repeat 1-3 with C. We should have something like 11110000 now.
>>
>> This suggests it can be reasonable to fill as many bits as possible. It
>> is probably not a very strong heuristic, but it shows some good results.
>>
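The fill argument above can be tried out with a small sketch. It assumes the
usual .gnu_hash bloom scheme, in which each symbol sets bit (H % C) and bit
((H >> Shift2) % C) in its word and a lookup passes only when both bits are
set. C = 64 is an assumption, and buildBloom, mayContain and countSetBits are
hypothetical helper names, not LLD or gold code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a bloom filter of 64-bit words from a list of symbol hashes.
// MaskWords must be a power of two and Shift2 must be below 32.
std::vector<uint64_t> buildBloom(const std::vector<uint32_t> &Hashes,
                                 size_t MaskWords, unsigned Shift2) {
  const unsigned C = 64; // bits per word (64-bit target assumed)
  std::vector<uint64_t> Words(MaskWords, 0);
  for (uint32_t H : Hashes) {
    size_t I = (H / C) & (MaskWords - 1);
    Words[I] |= uint64_t(1) << (H % C);             // first bit
    Words[I] |= uint64_t(1) << ((H >> Shift2) % C); // second bit
  }
  return Words;
}

// Returns true when both bits for H are set, i.e. the filter cannot rule
// the symbol out. Words.size() must be a power of two.
bool mayContain(const std::vector<uint64_t> &Words, uint32_t H,
                unsigned Shift2) {
  const unsigned C = 64;
  uint64_t W = Words[(H / C) & (Words.size() - 1)];
  return ((W >> (H % C)) & (W >> ((H >> Shift2) % C)) & 1) != 0;
}

// Counts how many bits are set across the whole filter.
size_t countSetBits(const std::vector<uint64_t> &Words) {
  size_t N = 0;
  for (uint64_t W : Words)
    for (; W; W &= W - 1) // clear the lowest set bit each iteration
      ++N;
  return N;
}
```

Calling countSetBits(buildBloom(Hashes, MaskWords, 6)) and then again with 11
on the same hash list shows how many distinct bits each Shift2 value ends up
setting. mayContain never returns false for a hash that was inserted with the
same Shift2.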
>
> I'm not sure I understood your point correctly, but I didn't say that
> changing the Shift2 value wouldn't change the bloom filter. What I wanted
> to say is that, in theory, no Shift2 value is statistically better than
> any other Shift2 value as long as the hash function doesn't produce biased
> hash values and the Shift2 value is large enough (greater than 6 on 64-bit,
> for example.)
>
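One way to probe this claim empirically is to feed a bloom filter uniformly
random 32-bit values instead of real symbol hashes and estimate the
false-positive rate for several Shift2 values. The sketch below does that
under assumptions that are not from the thread: 64-bit words, two bits per
symbol as in the .gnu_hash scheme, std::mt19937 as the source of unbiased
hashes, and a hypothetical helper name falsePositiveRate.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Estimates the bloom filter false-positive rate for one Shift2 value,
// using uniformly random 32-bit values in place of real symbol hashes.
// MaskWords must be a power of two and Shift2 must be below 32.
double falsePositiveRate(unsigned Shift2, size_t NumSyms, size_t MaskWords,
                         size_t NumQueries, uint32_t Seed) {
  const unsigned C = 64; // bits per bloom word (64-bit target assumed)
  std::mt19937 Rng(Seed);
  std::vector<uint64_t> Words(MaskWords, 0);

  // Insert NumSyms random "hashes", setting two bits for each.
  for (size_t N = 0; N != NumSyms; ++N) {
    uint32_t H = static_cast<uint32_t>(Rng());
    uint64_t &W = Words[(H / C) & (MaskWords - 1)];
    W |= uint64_t(1) << (H % C);
    W |= uint64_t(1) << ((H >> Shift2) % C);
  }

  // Probe with fresh random values. Almost none of them were inserted,
  // so nearly every hit counted here is a false positive.
  size_t Hits = 0;
  for (size_t N = 0; N != NumQueries; ++N) {
    uint32_t H = static_cast<uint32_t>(Rng());
    uint64_t W = Words[(H / C) & (MaskWords - 1)];
    if ((W >> (H % C)) & (W >> ((H >> Shift2) % C)) & 1)
      ++Hits;
  }
  return NumQueries ? double(Hits) / double(NumQueries) : 0.0;
}
```

Comparing, say, falsePositiveRate(11, 256, 16, 200000, 1) with
falsePositiveRate(16, 256, 16, 200000, 1) gives a rough feel for how much the
rate moves between Shift2 values when the hashes are unbiased. It says nothing
about real symbol names, whose hashes may be less evenly spread.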