[PATCH] D41993: [ELF] - Change shift2 constant of GNU_HASH from 6->11.

George Rimar via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 16 08:08:25 PST 2018


I also experimented a bit with evaluating the best Shift2 at run time, based on the approach mentioned earlier
(patch attached; it calculates a score for every Shift2 in [0..31] and selects the best one for the .gnu_hash table).

I am not suggesting doing that at run time, by the way; I just want to share that it seems to work pretty well.
It gave a significant boost for check-llvm, about the same as hardcoded values >= 11 show.

Best regards,
George | Developer | Access Softek, Inc

________________________________________
From: George Rimar
Sent: January 15, 2018, 17:52
To: Rafael Avila de Espindola; rafael.espindola+phab at gmail.com; ruiu at google.com
Cc: emaste at freebsd.org; Evgeny Leviant; Igor Kudrin; llvm-commits at lists.llvm.org; reviews+D41993+public+e0db5be8a5b1ab38 at reviews.llvm.org
Subject: Re: [PATCH] D41993: [ELF] - Change shift2 constant of GNU_HASH from 6->11.

>Do you know why this produces a better bloom filter?
>

I think so. My thoughts are below.

Bloom filter bits are calculated as:
H1 = dl_new_hash(name);
H2 = H1 >> shift2;
BITMASK = (1 << (H1 % C)) | (1 << (H2 % C));
bloom[N] |= BITMASK;
(sample taken from https://blogs.oracle.com/ali/gnu-hash-elf-sections).
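
For reference, here is a minimal C++ sketch of the computation above. It is an illustration, not the LLD code: the names gnuHash, addToBloom and mayContain are hypothetical, C is assumed to be 64 (a 64-bit target), and the word index N is assumed to be (H1 / C) % maskwords, which the quoted sample does not spell out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// GNU symbol hash (dl_new_hash): h = h * 33 + c, seeded with 5381.
uint32_t gnuHash(const std::string &name) {
  uint32_t h = 5381;
  for (unsigned char c : name)
    h = h * 33 + c;
  return h;
}

// Sets the two Bloom filter bits for one symbol hash.
// Assumption: C = 64 bits per word; bloom.size() plays the role of maskwords
// and must be non-zero.
void addToBloom(std::vector<uint64_t> &bloom, uint32_t h1, unsigned shift2) {
  const unsigned C = 64;
  uint32_t h2 = h1 >> shift2;
  size_t n = (h1 / C) % bloom.size();   // assumed word index N
  bloom[n] |= uint64_t(1) << (h1 % C);  // bit 1
  bloom[n] |= uint64_t(1) << (h2 % C);  // bit 2
}

// Lookup side: false means "definitely absent", true means "maybe present".
bool mayContain(const std::vector<uint64_t> &bloom, uint32_t h1,
                unsigned shift2) {
  const unsigned C = 64;
  uint32_t h2 = h1 >> shift2;
  uint64_t mask = (uint64_t(1) << (h1 % C)) | (uint64_t(1) << (h2 % C));
  return (bloom[(h1 / C) % bloom.size()] & mask) == mask;
}
```

A lookup can only be rejected early when at least one of the two bits is clear, so a shift2 for which both bits often coincide gives the filter less to work with.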

As far as I understand, this is what we should ideally achieve when writing such a filter
(using our code now):

We apply bit 1 first:
Val |= uint64_t(1) << (Sym.Hash % C);

Then bit 2:
Val |= uint64_t(1) << ((Sym.Hash >> getShift2()) % C);

I believe the idea is that we would like to find a shift2 constant such that applying
bit 2 to Val changes Val as often as possible:
(Val | Bit1) should ideally differ from (Val | Bit1 | Bit2).
That way we use as many different bits as possible in the bloom filter overall.

That is why I tried to play with Shift2 initially.

Today I wrote a simple test. It generates N symbols with random names of random length.
Then it calculates a Score for each possible Shift2, where Score is the number of times that setting Bit2
changed the bloom filter entry value (patch is attached).
The idea was to find the Shift2 that maximizes Score.
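
The scoring idea can be sketched roughly as follows. This is not the attached patch, only a hypothetical reconstruction from the description above: the function names, the alphabet, the name length range and the maskWords parameter are all assumptions, and C is again taken to be 64. It should not be expected to reproduce the exact numbers in this mail.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// GNU symbol hash: h = h * 33 + c, seeded with 5381.
uint32_t dlNewHash(const std::string &name) {
  uint32_t h = 5381;
  for (unsigned char c : name)
    h = h * 33 + c;
  return h;
}

// Random names of random length. Alphabet and length range (1..32) are
// arbitrary choices for this sketch.
std::vector<std::string> randomNames(size_t count, unsigned seed) {
  const std::string alphabet =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
  std::mt19937 rng(seed);
  std::uniform_int_distribution<size_t> lenDist(1, 32);
  std::uniform_int_distribution<size_t> chDist(0, alphabet.size() - 1);
  std::vector<std::string> names;
  names.reserve(count);
  for (size_t i = 0; i < count; ++i) {
    std::string s(lenDist(rng), ' ');
    for (char &c : s)
      c = alphabet[chDist(rng)];
    names.push_back(s);
  }
  return names;
}

// Score = number of symbols for which setting bit 2 changed the filter word
// after bit 1 had already been applied. maskWords must be non-zero.
size_t scoreShift2(const std::vector<uint32_t> &hashes, unsigned shift2,
                   size_t maskWords) {
  const unsigned C = 64;
  std::vector<uint64_t> bloom(maskWords, 0);
  size_t score = 0;
  for (uint32_t h : hashes) {
    uint64_t &val = bloom[(h / C) % maskWords];
    val |= uint64_t(1) << (h % C);             // bit 1
    uint64_t before = val;
    val |= uint64_t(1) << ((h >> shift2) % C); // bit 2
    if (val != before)
      ++score;
  }
  return score;
}

// Tries every Shift2 in [0..31] and returns the first one with the top score.
unsigned bestShift2(const std::vector<uint32_t> &hashes, size_t maskWords) {
  unsigned best = 0;
  size_t bestScore = 0;
  for (unsigned s = 0; s < 32; ++s) {
    size_t sc = scoreShift2(hashes, s, maskWords);
    if (sc > bestScore) {
      bestScore = sc;
      best = s;
    }
  }
  return best;
}
```

To use it, hash each name from randomNames with dlNewHash and pass the resulting vector to scoreShift2 or bestShift2. With Shift2 == 0 the second bit is always the same as the first, so the score is 0 whatever the input is.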

The results look a bit strange to me:
[Shift2] | [Score]
[0]  -> [0]
[1]  -> [8338]
[2]  -> [7762]
[3]  -> [6736]
[4]  -> [5281]
[5]  -> [3541]
[6]  -> [1995]
[7]  -> [1993]
[8]  -> [1992]
[9]  -> [1991]
[10] -> [1995]
[11] -> [1985]
[12] -> [3501]
[13] -> [5135]
[14] -> [6402]
[15] -> [7158]
[16] -> [7640]
[17] -> [7866]
[18] -> [7828]
[19] -> [7820]
[20] -> [7823]
[21] -> [7689]
[22] -> [7712]
[23] -> [7715]
[24] -> [7608]
[25] -> [7591]
[26] -> [7556]
[27] -> [7156]
[28] -> [6788]
[29] -> [6010]
[30] -> [5014]

So according to these results, there is almost no difference between Shift2==6 and Shift2==11,
though Shift2==12 already shows a significant difference. The best results are usually at Shift2 in [15..20] and at [1].

My patch changed the value to 11. According to the test above that should have no effect, because 11 != 12,
but I think random symbol names are just too far from the real-life names used in LLVM,
so there is probably nothing wrong with 11 showing good results for the check-llvm runs I tried.

Earlier (as I mentioned on the bug page) I also observed good results with Shift2=14. We could probably use it
or some other good value in [11..20] instead. I think any of them works much better than the current value of 6.

George.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.patch
Type: application/octet-stream
Size: 130306 bytes
Desc: patch.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180116/06f88d21/attachment-0001.obj>

