[PATCH] D45571: [ELF] - Speedup MergeInputSection::splitStrings

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Apr 29 18:35:27 PDT 2018


grimar added a comment.

I tried reading both 4 and 8 bytes at once. The code for the 8-byte version was:

  static inline size_t findSizeAndHash(StringRef S, uint64_t &Hash) {
    const char *Data = S.data();
    const char *const End = Data + S.size();
  
    // We are going to calculate a simple hash based on the DJB hash below. The
    // hash is calculated at the same time as we read the string bytes, for speed.
    uint64_t H = 5381;
  
    // Load a word at a time and test whether any of its bytes is a null byte.
    while (End - Data > 8) {
      uint64_t Word = read64(Data);
      // This checks if at least one byte of a word is a null byte.
      // If we found such case we want to break the loop and continue
      // testing the single bytes to find the exact null byte position.
      if ((Word - 0x0101010101010101) & ~Word & 0x8080808080808080)
        break;
      Data += 8;
      H = H * 33 + Word;
    }
  
    // Now find the exact position of the null byte. Do not forget to
    // update the hash value too.
    while (End - Data) {
      H = H * 33 + *Data;
      if (!*Data) {
        Hash = H;
        return Data - S.data() + 1;
      }
      ++Data;
    }
  
    llvm_unreachable("string is not null terminated");
  }

It generally seems to run about 50 ms slower than the 4-bytes-at-once version (the posted diff) for me,
though the difference is sometimes so small that I am inclined to think it may just be measurement noise.
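
For reference, the null-byte test used in the loop above can be looked at in isolation. The following is only a minimal sketch, not part of the patch: the names hasZeroByte and findNul are hypothetical, and std::memcpy stands in for read64 so that the snippet does not depend on LLVM headers. Unlike findSizeAndHash, it computes no hash and returns the index of the null byte rather than the string size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true if at least one of the eight bytes of Word is zero.
// Subtracting 1 from each byte sets its high bit when the byte was zero
// (or was >= 0x81); masking with ~Word drops the bytes whose high bit
// was already set, so only a genuine zero byte can leave a bit behind.
static inline bool hasZeroByte(uint64_t Word) {
  return ((Word - 0x0101010101010101ULL) & ~Word & 0x8080808080808080ULL) != 0;
}

// Hypothetical helper: returns the index of the first null byte in
// [Data, Data + Size), or Size if the range contains none.
static inline size_t findNul(const char *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  size_t I = 0;
  // Word-at-a-time phase: skip 8-byte chunks that contain no null byte.
  while (Size - I >= 8) {
    uint64_t Word;
    std::memcpy(&Word, Data + I, sizeof(Word)); // alignment-safe load
    if (hasZeroByte(Word))
      break;
    I += 8;
  }
  // Byte-at-a-time phase: find the exact position of the null byte.
  while (I < Size && Data[I] != '\0')
    ++I;
  return I;
}
```

The test only answers whether some byte in the word is zero, not which one, which is why a byte-wise loop is still needed afterwards. The yes/no answer does not depend on byte order, so the sketch behaves the same on little- and big-endian hosts.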


https://reviews.llvm.org/D45571




