[PATCH] D45571: [ELF] - Speedup MergeInputSection::splitStrings
George Rimar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 25 05:32:50 PDT 2018
grimar added inline comments.
================
Comment at: ELF/InputSection.cpp:886
+ while (DataSize > sizeof(uint32_t)) {
+ uint32_t Word = *reinterpret_cast<const uint32_t *>(Data);
+ // This checks if at least one byte of a word is a null byte.
----------------
espindola wrote:
> This will produce different results on a big endian host, no?
You are right.
To fix this, I first tried rewriting the hashing code right below to read byte by byte in a loop, but that hurt performance too much.
I then tried both `read32be()` and `read32le()` (my host is LE). The average time was the same for both and did not differ from `*reinterpret_cast<const uint32_t *>`, so it seems we can use one of them.
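For illustration, here is a minimal sketch of a host-independent 32-bit load together with a common bit trick for the null-byte check mentioned in the diff comment. The names `loadLE32` and `hasZeroByte` are hypothetical, and this is neither the patch's code nor LLVM's `read32le()` implementation. Compilers can often turn such a load into a single instruction on an LE host, which would fit the timings above, though that is not guaranteed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not LLVM's read32le): reads 4 bytes as a
// little-endian value, so the result is the same on LE and BE hosts.
static inline uint32_t loadLE32(const void *P) {
  unsigned char B[4];
  std::memcpy(B, P, sizeof(B)); // Avoids unaligned or aliasing-unsafe access.
  return uint32_t(B[0]) | (uint32_t(B[1]) << 8) | (uint32_t(B[2]) << 16) |
         (uint32_t(B[3]) << 24);
}

// Well-known bit trick, assumed here for illustration: returns true if at
// least one of the four bytes of Word is zero.
static inline bool hasZeroByte(uint32_t Word) {
  return ((Word - 0x01010101u) & ~Word & 0x80808080u) != 0;
}
```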
================
Comment at: ELF/InputSection.cpp:894
+ DataSize -= sizeof(uint32_t);
+ Hash = (Hash << 5) + Hash + Word;
+ }
----------------
espindola wrote:
> I don't know enough about hashing to judge if this is a reasonable extension of the djb hash for using 4 bytes at a time, but we can probably start with it.
I experimented here: any bad hash increases hash collisions, which shows up instantly in the profile. My approach seems to work well, so I think it is OK. It is simpler than taking single bytes, and the loop reading bytes was also slower for me.
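As a rough sketch of the idea under discussion, the function below hashes a buffer of known length four bytes at a time using the `Hash * 33 + Word` step from the quoted diff, then finishes with the usual byte-wise djb step. The name `hashWords`, the seed 5381, and the tail handling are assumptions for illustration; the real patch also looks for the terminating null inside the same loop, which is left out here. The load is host-endian, so the result differs between LE and BE hosts, which is the concern raised in the first comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch, not the code from D45571.
static uint32_t hashWords(const char *Data, size_t Size) {
  uint32_t Hash = 5381; // Conventional djb seed; an assumption here.
  while (Size > sizeof(uint32_t)) {
    uint32_t Word;
    std::memcpy(&Word, Data, sizeof(Word)); // Host-endian, alignment-safe load.
    Hash = (Hash << 5) + Hash + Word;       // Hash * 33 + Word.
    Data += sizeof(uint32_t);
    Size -= sizeof(uint32_t);
  }
  // Byte-at-a-time djb step for the remaining 1 to 4 bytes.
  for (; Size != 0; --Size)
    Hash = (Hash << 5) + Hash + static_cast<unsigned char>(*Data++);
  return Hash;
}
```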
https://reviews.llvm.org/D45571