[PATCH] D45571: [ELF] - Speedup MergeInputSection::splitStrings
Rui Ueyama via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 27 14:13:06 PDT 2018
ruiu added inline comments.
================
Comment at: ELF/InputSection.cpp:872
+ // calculated as the same time as we read the string bytes for speedup.
+ uint32_t Hash = 5381;
+
----------------
I'd name this `H`.
================
Comment at: ELF/InputSection.cpp:875
+ // Load a word at a time and test if any of bytes is 0-byte.
+ while (DataSize > sizeof(uint32_t)) {
+ uint32_t Word = read32(Data);
----------------
sizeof(uint32_t) is always 4, so please write just 4.
================
Comment at: ELF/InputSection.cpp:880
+ // testing the single bytes to find the exact null byte position.
+ if (((Word - 0x01010101) & ~Word & 0x80808080) != 0)
+ break;
----------------
Omit `!= 0` because it is always implied.
================
Comment at: ELF/InputSection.cpp:883
+ Data += sizeof(uint32_t);
+ DataSize -= sizeof(uint32_t);
+ Hash = (Hash << 5) + Hash + Word;
----------------
I don't think you need to update DataSize. You know the end position of S, so you can compare Data with it to see if you've reached end of a string.
================
Comment at: ELF/InputSection.cpp:884
+ DataSize -= sizeof(uint32_t);
+ Hash = (Hash << 5) + Hash + Word;
+ }
----------------
`* 33` is perhaps better for brevity.
================
Comment at: ELF/InputSection.cpp:889
+ // update the hash value too.
+ while (DataSize--) {
+ Hash = (Hash << 5) + Hash + *Data;
----------------
Likewise, maintaining both DataSize and Data doesn't seem to make sense to me.
================
Comment at: ELF/InputSection.cpp:896
+
+ llvm_unreachable("string is not null terminated");
+}
----------------
You should be able to handle this case gracefully.
================
Comment at: ELF/InputSection.cpp:901
// null-terminated strings.
void MergeInputSection::splitStrings(ArrayRef<uint8_t> Data, size_t EntSize) {
+ if (Data.size() < EntSize || Data.back() != 0)
----------------
Split this entire function into two; one for '\0'-terminated string and the other for multi-word string. That's better than doing it inside the while loop for readability.
================
Comment at: ELF/InputSection.cpp:914-916
+ uint64_t SizeHash = findSizeAndHash(S);
+ Size = SizeHash & 0xFFFFFFFF;
+ Hash = SizeHash >> 32;
----------------
This is very odd, and we should avoid returning two values in one word. Since `findSizeAndHash` is inlined, I don't think you need this.
https://reviews.llvm.org/D45571
More information about the llvm-commits
mailing list