[PATCH] D128407: Undefined behaviour in Support/DJB.h

Troels F. Rønnow via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 23 01:25:18 PDT 2022


troelsfr created this revision.
Herald added a project: All.
troelsfr requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

When LLVM is compiled with the undefined behaviour sanitizer enabled, the hash function in `DJB.h` causes the following error:

  sh
  external/llvm-project/llvm/include/llvm/Support/DJB.h:23:12: runtime error: left shift of 4086722279 by 5 places cannot be represented in type 'uint32_t' (aka 'unsigned int')
  ISSUE: UndefinedBehaviorSanitizer: undefined-behavior external/llvm-project/llvm/include/llvm/Support/DJB.h:23:12
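
As a worked check of the arithmetic behind this report, the sketch below widens the reported value to 64 bits before shifting, so the full result can be compared against the `uint32_t` range. The helper names `wideShift5` and `keptBits` are hypothetical and are not part of LLVM; the constant is copied from the report above.

```cpp
#include <cstdint>

// Hypothetical helper: the full, untruncated value of H << 5, computed in 64 bits.
constexpr uint64_t wideShift5(uint32_t H) {
  return static_cast<uint64_t>(H) << 5;
}

// Hypothetical helper: the low 32 bits of that value, which is all a
// uint32_t shift can keep.
constexpr uint32_t keptBits(uint32_t H) {
  return static_cast<uint32_t>(wideShift5(H) & 0xFFFFFFFFu);
}

// Value taken from the sanitizer report for DJB.h.
constexpr uint32_t Reported = 4086722279u;

// 4086722279 * 32 = 130775112928, which needs more than 32 bits.
static_assert(wideShift5(Reported) == 130775112928ull, "full shift result");
static_assert(wideShift5(Reported) > UINT32_MAX, "does not fit in uint32_t");

// Only the low 32 bits survive the 32-bit shift.
static_assert(keptBits(Reported) == 1926094048u, "truncated shift result");
```

The shifted value is 130775112928, while the largest `uint32_t` is 4294967295, so the high bits are discarded and the sanitizer flags the shift.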

The issue can be reproduced (and the solution verified) using a standalone version of the hash function:

  c++
  #include <iostream>
  #include <cstdint>
  #include <string>
    
   inline uint32_t djbHash(std::string Buffer, uint32_t H = 5381) {
     for (unsigned char C : Buffer)
       H = (H  << 5ull) + H + C;
     return H;
   }
    
  inline uint32_t djbHashFixed(std::string Buffer, uint64_t H = 5381) {
    for (unsigned char C : Buffer)
      H = ((H & 0xffffffff) << 5ull) + H + C;
    return static_cast<uint32_t>(H);
  }
  
  int main()
  {
      std::cout << djbHashFixed("Hello world") << std::endl;
      std::cout << djbHash("Hello world") << std::endl;
      return 0;
  }

Compiling with:

  sh
  clang++ --std=c++17 -fsanitize=address,integer,undefined -fno-omit-frame-pointer -fno-sanitize-recover=all -g -O0 test.cpp -o test

shows the error occurring in the original hash function, but not in the fixed one:

  % ./test
  2310742177
  test.cpp:7:14: runtime error: left shift of 193458846 by 5 places cannot be represented in type 'uint32_t' (aka 'unsigned int')
  ISSUE: UndefinedBehaviorSanitizer: undefined-behavior test.cpp:7:14 in
  zsh: abort      ./test
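
Because the sanitized run aborts inside the original function, it never prints a second value to compare against. The sketch below is one way to check that the masked 64-bit form returns the same hashes as the 32-bit form, assuming it is built without `-fsanitize=integer` so that the 32-bit form can run to completion. Both forms compute `33 * H + C` in their low 32 bits, so the truncated results should match. The names `djbHash32`, `djbHashMasked` and `hashesAgree` are hypothetical, chosen for this illustration only.

```cpp
#include <cstdint>
#include <string>

// 32-bit form, following the standalone reproduction.
inline uint32_t djbHash32(const std::string &Buffer, uint32_t H = 5381) {
  for (unsigned char C : Buffer)
    H = (H << 5) + H + C;
  return H;
}

// Masked 64-bit form, following the proposed fix: only the low 32 bits of H
// are shifted, and the result is truncated once at the end.
inline uint32_t djbHashMasked(const std::string &Buffer, uint64_t H = 5381) {
  for (unsigned char C : Buffer)
    H = ((H & 0xFFFFFFFF) << 5) + H + C;
  return static_cast<uint32_t>(H);
}

// Hypothetical helper: true when both forms agree on every sample string.
inline bool hashesAgree() {
  const std::string Samples[] = {"", "a", "Hello world",
                                 std::string(1000, 'z')};
  for (const std::string &S : Samples)
    if (djbHash32(S) != djbHashMasked(S))
      return false;
  return true;
}
```

For "Hello world" both forms should give 2310742177, the value printed by the fixed function in the run above.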




https://reviews.llvm.org/D128407

Files:
  llvm/include/llvm/Support/DJB.h


Index: llvm/include/llvm/Support/DJB.h
===================================================================
--- llvm/include/llvm/Support/DJB.h
+++ llvm/include/llvm/Support/DJB.h
@@ -18,10 +18,10 @@
 namespace llvm {
 
 /// The Bernstein hash function used by the DWARF accelerator tables.
-inline uint32_t djbHash(StringRef Buffer, uint32_t H = 5381) {
+inline uint32_t djbHash(StringRef Buffer, uint64_t H = 5381) {
   for (unsigned char C : Buffer.bytes())
-    H = (H << 5) + H + C;
-  return H;
+    H = ((H & 0xFFFFFFFF) << 5) + H + C;
+  return static_cast<uint32_t>(H);
 }
 
 /// Computes the Bernstein hash after folding the input according to the Dwarf 5

