[clang] 7c298c1 - [PGO] Fix computation of function Hash

via cfe-commits cfe-commits at lists.llvm.org
Mon May 25 08:17:46 PDT 2020


Author: serge-sans-paille
Date: 2020-05-25T17:17:29+02:00
New Revision: 7c298c104bfe725d4315926a656263e8a5ac3054

URL: https://github.com/llvm/llvm-project/commit/7c298c104bfe725d4315926a656263e8a5ac3054
DIFF: https://github.com/llvm/llvm-project/commit/7c298c104bfe725d4315926a656263e8a5ac3054.diff

LOG: [PGO] Fix computation of function Hash

Previous implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.

Added a new test case that trivially tests that a small function change is
reflected in the hash value.

Not that as this patch fixes the hash computation, it invalidates all hashes
computed before that patch applies, which could be an issue for large build
system that pre-compute the profile data and let client download them as part of
the build process.

Differential Revision: https://reviews.llvm.org/D79961

Added: 
    clang/test/Profile/c-collision.c

Modified: 
    clang/lib/CodeGen/CodeGenPGO.cpp

Removed: 
    


################################################################################
diff  --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp
index 3c91a04d5464..98827bc3eec5 100644
--- a/clang/lib/CodeGen/CodeGenPGO.cpp
+++ b/clang/lib/CodeGen/CodeGenPGO.cpp
@@ -747,13 +747,15 @@ uint64_t PGOHash::finalize() {
     return Working;
 
   // Check for remaining work in Working.
-  if (Working)
-    MD5.update(Working);
+  if (Working) {
+    using namespace llvm::support;
+    uint64_t Swapped = endian::byte_swap<uint64_t, little>(Working);
+    MD5.update(llvm::makeArrayRef((uint8_t *)&Swapped, sizeof(Swapped)));
+  }
 
   // Finalize the MD5 and return the hash.
   llvm::MD5::MD5Result Result;
   MD5.final(Result);
-  using namespace llvm::support;
   return Result.low();
 }
 

diff  --git a/clang/test/Profile/c-collision.c b/clang/test/Profile/c-collision.c
new file mode 100644
index 000000000000..fabecd752b4e
--- /dev/null
+++ b/clang/test/Profile/c-collision.c
@@ -0,0 +1,22 @@
+// Test that a slight change in the code leads to a 
diff erent hash.
+// RUN: %clang_cc1 -UEXTRA -triple x86_64-unknown-linux-gnu -main-file-name c-collision.c %s -o - -emit-llvm -fprofile-instrument=clang | FileCheck %s --check-prefix=CHECK-NOEXTRA
+// RUN: %clang_cc1 -DEXTRA -triple x86_64-unknown-linux-gnu -main-file-name c-collision.c %s -o - -emit-llvm -fprofile-instrument=clang | FileCheck %s --check-prefix=CHECK-EXTRA
+
+// CHECK-NOEXTRA: @__profd_foo = private global { {{.*}} } { i64 6699318081062747564, i64 7156072912471487002,
+// CHECK-EXTRA:   @__profd_foo = private global { {{.*}} } { i64 6699318081062747564, i64 -4383447408116050035,
+
+extern int bar;
+void foo() {
+  if (bar) {
+  }
+  if (bar) {
+  }
+  if (bar) {
+    if (bar) {
+#ifdef EXTRA
+      if (bar) {
+      }
+#endif
+    }
+  }
+}


        


More information about the cfe-commits mailing list