[PATCH] D41993: [ELF] - Change shift2 constant of GNU_HASH from 6->11.

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 16 17:43:11 PST 2018


I feel like this change looks a bit too arbitrary. If we can prove that it
is in general an improvement, I'm fine with this change, but even George's
experiment shows that 11 may not be an optimal number.

Probably what we should do is to try different shift2 numbers to see if
which one yields the best result on the fly. I'll try to see if this isn't
too slow.

On Tue, Jan 16, 2018 at 11:13 AM, Rafael EspĂ­ndola <
rafael.espindola at gmail.com> wrote:

> I have mixed feelings about this.
>
> I can reproduce an improvement with 11:
>
> shift=6
> 160.683939775 seconds time elapsed
>      ( +-  0.18% )
> 161.236797654 seconds time elapsed
>      ( +-  0.30% )
>
> shift=11
> 157.118922858 seconds time elapsed
>      ( +-  0.10% )
> 158.837348716 seconds time elapsed
>      ( +-  0.18% )
>
> But there is still a somewhat large variation between perf runs.
>
> This also feels a bit magical, but maybe it makes perfect sense for
> someone familiar with hash tables and bloom filters.
>
> So I guess I am OK with the patch if it includes a comment saying that
> the value was found experimentally.
>
> Rui, what do you think?
>
> Cheers,
> Rafael
>
>
> On 15 January 2018 at 06:52, George Rimar <grimar at accesssoftek.com> wrote:
> >>Do you know why this produces a better bloom filter?
> >>
> >
> > I think so. My thoughts are below.
> >
> > Bloom filter bits are calculated as:
> > H1 = dl_new_hash(name);
> > H2 = H1 >> shift2;
> > BITMASK = (1 << (H1 % C)) | (1 << (H2 % C));
> > bloom[N] |= BITMASK;
> > (sample taken from https://blogs.oracle.com/ali/gnu-hash-elf-sections).
> >
> > As far I understand we ideally should archieve next thing when writing
> such filter:
> > (using out code now)
> >
> > We apply bit 1 at first:
> > Val |= uint64_t(1) << (Sym.Hash % C);
> >
> > Then bit 2:
> > Val |= uint64_t(1) << ((Sym.Hash >> getShift2()) % C);
> >
> > I believe idea here is that we would like to find such shift2 constant
> that applying
> > bit 2 to Val should change Val as often as possible,
> > (Val | Bit1) ideally should be different from (Val | Bit1 | Bit2).
> > So we want to use as much different bits as possible in bloom filter
> overall.
> >
> > That was why I tried to play with Shift2 initially.
> >
> > Today I wrote simple test. It generates N symbols with random name of
> random length.
> > Then calculates Score for each Shift2 possible, where Score is amount of
> times where setting of Bit2
> > changed the bloom filter entry value. (patch is attached).
> > So idea was to find Shift2 so that Score is maximum.
> >
> > Results looks a bit strange for me:
> > [Shift2] | [Score]
> > [0]  -> [0]
> > [1]  -> [8338]
> > [2]  -> [7762]
> > [3]  -> [6736]
> > [4]  -> [5281]
> > [5]  -> [3541]
> > [6]  -> [1995]
> > [7]  -> [1993]
> > [8]  -> [1992]
> > [9]  -> [1991]
> > [10] -> [1995]
> > [11] -> [1985]
> > [12] -> [3501]
> > [13] -> [5135]
> > [14] -> [6402]
> > [15] -> [7158]
> > [16] -> [7640]
> > [17] -> [7866]
> > [18] -> [7828]
> > [19] -> [7820]
> > [20] -> [7823]
> > [21] -> [7689]
> > [22] -> [7712]
> > [23] -> [7715]
> > [24] -> [7608]
> > [25] -> [7591]
> > [26] -> [7556]
> > [27] -> [7156]
> > [28] -> [6788]
> > [29] -> [6010]
> > [30] -> [5014]
> >
> > So according to them, there is almost no difference between Shift2==6
> and Shift2==11,
> > though Shift2==12 already shows significant difference. Best results are
> usually in Shift2 = [15..20] and at [1].
> >
> > My patch changed value to 11. According to test above that should give
> no effect, because 11 != 12,
> > but I think that random symbol names are just to far from real live
> names used in LLVM,
> > and so probably nothing wrong with that 11 shows good results for
> check-llvm calls I tried.
> >
> > Earlier (I mentioned at bug page) I also observer good result with
> Shift2=14. We probably could use it
> > or some other good value in [11..20] instead. I think any of them work
> much better than current value 6 used.
> >
> > George.
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180116/7806be62/attachment-0001.html>


More information about the llvm-commits mailing list