[clang] [llvm] [Support] Add SipHash-based 16/64-bit ptrauth stable hash. (PR #93902)

Ahmed Bougacha via cfe-commits cfe-commits at lists.llvm.org
Fri May 31 13:08:52 PDT 2024


https://github.com/ahmedbougacha updated https://github.com/llvm/llvm-project/pull/93902

>From fc8f76b404b25951dc10ecaaa760b4b4c3d4f858 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Thu, 30 May 2024 17:07:04 -0700
Subject: [PATCH 1/2] [clang] Add arm64e ABI-defined key assignments to
 ptrauth.h.

These are currently gated by __APPLE__; we can figure out a way to
define these on ELF targets as well, and maybe have them be defined
by clang itself, depending on ABI modes.
---
 clang/lib/Headers/ptrauth.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/clang/lib/Headers/ptrauth.h b/clang/lib/Headers/ptrauth.h
index 56c3c3636c9bc..036665d75a91b 100644
--- a/clang/lib/Headers/ptrauth.h
+++ b/clang/lib/Headers/ptrauth.h
@@ -15,6 +15,29 @@ typedef enum {
   ptrauth_key_asib = 1,
   ptrauth_key_asda = 2,
   ptrauth_key_asdb = 3,
+
+#ifdef __APPLE__
+  /* A process-independent key which can be used to sign code pointers.
+     Signing and authenticating with this key is a no-op in processes
+     which disable ABI pointer authentication. */
+  ptrauth_key_process_independent_code = ptrauth_key_asia,
+
+  /* A process-specific key which can be used to sign code pointers.
+     Signing and authenticating with this key is enforced even in processes
+     which disable ABI pointer authentication. */
+  ptrauth_key_process_dependent_code = ptrauth_key_asib,
+
+  /* A process-independent key which can be used to sign data pointers.
+     Signing and authenticating with this key is a no-op in processes
+     which disable ABI pointer authentication. */
+  ptrauth_key_process_independent_data = ptrauth_key_asda,
+
+  /* A process-specific key which can be used to sign data pointers.
+     Signing and authenticating with this key is a no-op in processes
+     which disable ABI pointer authentication. */
+  ptrauth_key_process_dependent_data = ptrauth_key_asdb,
+#endif /* __APPLE__ */
+
 } ptrauth_key;
 
 /* An integer type of the appropriate size for a discriminator argument. */

>From 42cb73fecf1033aefbe824149f3d8a5352bb2103 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Thu, 30 May 2024 17:05:31 -0700
Subject: [PATCH 2/2] [Support] Add SipHash-based 16/64-bit ptrauth stable
 hash.

Based on the SipHash reference implementation:
  https://github.com/veorq/SipHash
which has very graciously been licensed under our llvm license
(Apache-2.0 WITH LLVM-exception) by Jean-Philippe Aumasson.

This lightly modifies it to fit into libSupport, and wraps it for the
two main interfaces we're interested in (16/64-bit).

This intentionally doesn't expose a raw interface beyond that to
encourage others to carefully consider their use.

The exact algorithm is the little-endian interpretation of the
non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using
a specific key value which can be found in the source.

By "stable" we mean that the result of this hash algorithm will the same
across different compiler versions and target platforms.

The 16-bit variant is used extensively for the AArch64 ptrauth ABI,
because AArch64 can efficiently load a 16-bit immediate into the high
bits of a register without disturbing the remainder of the value, which
serves as a nice blend operation.

16 bits is also sufficiently compact to not inflate a loader relocation.
We disallow zero to guarantee a different discriminator from the places
in the ABI that use a constant zero.

Co-Authored-By: John McCall <rjmccall at apple.com>
---
 llvm/include/llvm/Support/SipHash.h    |  47 +++++++
 llvm/lib/Support/CMakeLists.txt        |   1 +
 llvm/lib/Support/SipHash.cpp           | 174 +++++++++++++++++++++++++
 llvm/unittests/Support/CMakeLists.txt  |   1 +
 llvm/unittests/Support/SipHashTest.cpp |  43 ++++++
 5 files changed, 266 insertions(+)
 create mode 100644 llvm/include/llvm/Support/SipHash.h
 create mode 100644 llvm/lib/Support/SipHash.cpp
 create mode 100644 llvm/unittests/Support/SipHashTest.cpp

diff --git a/llvm/include/llvm/Support/SipHash.h b/llvm/include/llvm/Support/SipHash.h
new file mode 100644
index 0000000000000..fcc29c00da185
--- /dev/null
+++ b/llvm/include/llvm/Support/SipHash.h
@@ -0,0 +1,47 @@
+//===--- SipHash.h - An ABI-stable string SipHash ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// A family of ABI-stable string hash algorithms based on SipHash, currently
+// used to compute ptrauth discriminators.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_SIPHASH_H
+#define LLVM_SUPPORT_SIPHASH_H
+
+#include <cstdint>
+
+namespace llvm {
+class StringRef;
+
+/// Compute a stable 64-bit hash of the given string.
+///
+/// The exact algorithm is the little-endian interpretation of the
+/// non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using
+/// a specific key value which can be found in the source.
+///
+/// By "stable" we mean that the result of this hash algorithm will
+/// the same across different compiler versions and target platforms.
+uint64_t getPointerAuthStableSipHash64(StringRef S);
+
+/// Compute a stable non-zero 16-bit hash of the given string.
+///
+/// This computes the full getPointerAuthStableSipHash64, but additionally
+/// truncates it down to a non-zero 16-bit value.
+///
+/// We use a 16-bit discriminator because ARM64 can efficiently load
+/// a 16-bit immediate into the high bits of a register without disturbing
+/// the remainder of the value, which serves as a nice blend operation.
+/// 16 bits is also sufficiently compact to not inflate a loader relocation.
+/// We disallow zero to guarantee a different discriminator from the places
+/// in the ABI that use a constant zero.
+uint64_t getPointerAuthStableSipHash16(StringRef S);
+
+} // end namespace llvm
+
+#endif
diff --git a/llvm/lib/Support/CMakeLists.txt b/llvm/lib/Support/CMakeLists.txt
index be4badc09efa5..aa37b812791ff 100644
--- a/llvm/lib/Support/CMakeLists.txt
+++ b/llvm/lib/Support/CMakeLists.txt
@@ -222,6 +222,7 @@ add_llvm_component_library(LLVMSupport
   SHA1.cpp
   SHA256.cpp
   Signposts.cpp
+  SipHash.cpp
   SmallPtrSet.cpp
   SmallVector.cpp
   SourceMgr.cpp
diff --git a/llvm/lib/Support/SipHash.cpp b/llvm/lib/Support/SipHash.cpp
new file mode 100644
index 0000000000000..dbd60fb73ebb5
--- /dev/null
+++ b/llvm/lib/Support/SipHash.cpp
@@ -0,0 +1,174 @@
+//===--- StableHash.cpp - An ABI-stable string hash -----------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//  This file implements an ABI-stable string hash based on SipHash, used to
+//  compute ptrauth discriminators.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/SipHash.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Debug.h"
+#include <cstdint>
+#include <cstring>
+
+using namespace llvm;
+
+#define DEBUG_TYPE "llvm-siphash"
+
+//  Lightly adapted from the SipHash reference C implementation by
+//  Jean-Philippe Aumasson and Daniel J. Bernstein.
+
+#define SIPHASH_ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
+
+#define SIPHASH_U8TO64_LE(p)                                                   \
+  (((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) |                          \
+   ((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) |                   \
+   ((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) |                   \
+   ((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))
+
+#define SIPHASH_SIPROUND                                                       \
+  do {                                                                         \
+    v0 += v1;                                                                  \
+    v1 = SIPHASH_ROTL(v1, 13);                                                 \
+    v1 ^= v0;                                                                  \
+    v0 = SIPHASH_ROTL(v0, 32);                                                 \
+    v2 += v3;                                                                  \
+    v3 = SIPHASH_ROTL(v3, 16);                                                 \
+    v3 ^= v2;                                                                  \
+    v0 += v3;                                                                  \
+    v3 = SIPHASH_ROTL(v3, 21);                                                 \
+    v3 ^= v0;                                                                  \
+    v2 += v1;                                                                  \
+    v1 = SIPHASH_ROTL(v1, 17);                                                 \
+    v1 ^= v2;                                                                  \
+    v2 = SIPHASH_ROTL(v2, 32);                                                 \
+  } while (0)
+
+template <int cROUNDS, int dROUNDS, class ResultTy>
+static inline ResultTy siphash(const uint8_t *in, uint64_t inlen,
+                               const uint8_t (&k)[16]) {
+  static_assert(sizeof(ResultTy) == 8 || sizeof(ResultTy) == 16,
+                "result type should be uint64_t or uint128_t");
+  uint64_t v0 = 0x736f6d6570736575ULL;
+  uint64_t v1 = 0x646f72616e646f6dULL;
+  uint64_t v2 = 0x6c7967656e657261ULL;
+  uint64_t v3 = 0x7465646279746573ULL;
+  uint64_t b;
+  uint64_t k0 = SIPHASH_U8TO64_LE(k);
+  uint64_t k1 = SIPHASH_U8TO64_LE(k + 8);
+  uint64_t m;
+  int i;
+  const uint8_t *end = in + inlen - (inlen % sizeof(uint64_t));
+  const int left = inlen & 7;
+  b = ((uint64_t)inlen) << 56;
+  v3 ^= k1;
+  v2 ^= k0;
+  v1 ^= k1;
+  v0 ^= k0;
+
+  if (sizeof(ResultTy) == 16) {
+    v1 ^= 0xee;
+  }
+
+  for (; in != end; in += 8) {
+    m = SIPHASH_U8TO64_LE(in);
+    v3 ^= m;
+
+    for (i = 0; i < cROUNDS; ++i)
+      SIPHASH_SIPROUND;
+
+    v0 ^= m;
+  }
+
+  switch (left) {
+  case 7:
+    b |= ((uint64_t)in[6]) << 48;
+    LLVM_FALLTHROUGH;
+  case 6:
+    b |= ((uint64_t)in[5]) << 40;
+    LLVM_FALLTHROUGH;
+  case 5:
+    b |= ((uint64_t)in[4]) << 32;
+    LLVM_FALLTHROUGH;
+  case 4:
+    b |= ((uint64_t)in[3]) << 24;
+    LLVM_FALLTHROUGH;
+  case 3:
+    b |= ((uint64_t)in[2]) << 16;
+    LLVM_FALLTHROUGH;
+  case 2:
+    b |= ((uint64_t)in[1]) << 8;
+    LLVM_FALLTHROUGH;
+  case 1:
+    b |= ((uint64_t)in[0]);
+    break;
+  case 0:
+    break;
+  }
+
+  v3 ^= b;
+
+  for (i = 0; i < cROUNDS; ++i)
+    SIPHASH_SIPROUND;
+
+  v0 ^= b;
+
+  if (sizeof(ResultTy) == 8) {
+    v2 ^= 0xff;
+  } else {
+    v2 ^= 0xee;
+  }
+
+  for (i = 0; i < dROUNDS; ++i)
+    SIPHASH_SIPROUND;
+
+  b = v0 ^ v1 ^ v2 ^ v3;
+
+  // This mess with the result type would be easier with 'if constexpr'.
+
+  uint64_t firstHalf = b;
+  if (sizeof(ResultTy) == 8)
+    return firstHalf;
+
+  v1 ^= 0xdd;
+
+  for (i = 0; i < dROUNDS; ++i)
+    SIPHASH_SIPROUND;
+
+  b = v0 ^ v1 ^ v2 ^ v3;
+  uint64_t secondHalf = b;
+
+  return firstHalf | (ResultTy(secondHalf) << (sizeof(ResultTy) == 8 ? 0 : 64));
+}
+
+//===--- LLVM-specific wrappers around siphash.
+
+/// Compute an ABI-stable 64-bit hash of the given string.
+uint64_t llvm::getPointerAuthStableSipHash64(StringRef Str) {
+  static const uint8_t K[16] = {0xb5, 0xd4, 0xc9, 0xeb, 0x79, 0x10, 0x4a, 0x79,
+                                0x6f, 0xec, 0x8b, 0x1b, 0x42, 0x87, 0x81, 0xd4};
+
+  // The aliasing is fine here because of omnipotent char.
+  auto *Data = reinterpret_cast<const uint8_t *>(Str.data());
+  return siphash<2, 4, uint64_t>(Data, Str.size(), K);
+}
+
+/// Compute an ABI-stable 16-bit hash of the given string.
+uint64_t llvm::getPointerAuthStableSipHash16(StringRef Str) {
+  uint64_t RawHash = getPointerAuthStableSipHash64(Str);
+
+  // Produce a non-zero 16-bit discriminator.
+  uint64_t Discriminator = (RawHash % 0xFFFF) + 1;
+  LLVM_DEBUG(dbgs() << "ptrauth stable hash 16-bit discriminator: "
+                    << utostr(Discriminator) << " (0x"
+                    << utohexstr(Discriminator) << ")"
+                    << " of: " << Str << "\n");
+  return Discriminator;
+}
diff --git a/llvm/unittests/Support/CMakeLists.txt b/llvm/unittests/Support/CMakeLists.txt
index 2718be8450f80..631f2e6bf00df 100644
--- a/llvm/unittests/Support/CMakeLists.txt
+++ b/llvm/unittests/Support/CMakeLists.txt
@@ -75,6 +75,7 @@ add_llvm_unittest(SupportTests
   ScopedPrinterTest.cpp
   SHA256.cpp
   SignalsTest.cpp
+  SipHashTest.cpp
   SourceMgrTest.cpp
   SpecialCaseListTest.cpp
   SuffixTreeTest.cpp
diff --git a/llvm/unittests/Support/SipHashTest.cpp b/llvm/unittests/Support/SipHashTest.cpp
new file mode 100644
index 0000000000000..1a8143d9c9375
--- /dev/null
+++ b/llvm/unittests/Support/SipHashTest.cpp
@@ -0,0 +1,43 @@
+//===- llvm/unittest/Support/SipHashTest.cpp ------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/SipHash.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+namespace {
+
+TEST(SipHashTest, PointerAuthSipHash) {
+  // Test some basic cases, for 16 bit and 64 bit results.
+  EXPECT_EQ(0xE793U, getPointerAuthStableSipHash16(""));
+  EXPECT_EQ(0xF468U, getPointerAuthStableSipHash16("strlen"));
+  EXPECT_EQ(0x2D15U, getPointerAuthStableSipHash16("_ZN1 ind; f"));
+
+  EXPECT_EQ(0xB2BB69BB0A2AC0F1U, getPointerAuthStableSipHash64(""));
+  EXPECT_EQ(0x9304ABFF427B72E8U, getPointerAuthStableSipHash64("strlen"));
+  EXPECT_EQ(0x55F45179A08AE51BU, getPointerAuthStableSipHash64("_ZN1 ind; f"));
+
+  // Test some known strings that are already enshrined in the ABI.
+  EXPECT_EQ(0x6AE1U, getPointerAuthStableSipHash16("isa"));
+  EXPECT_EQ(0xB5ABU, getPointerAuthStableSipHash16("objc_class:superclass"));
+  EXPECT_EQ(0xC0BBU, getPointerAuthStableSipHash16("block_descriptor"));
+  EXPECT_EQ(0xC310U, getPointerAuthStableSipHash16("method_list_t"));
+
+  // Test the limits that apply to 16 bit results but don't to 64 bit results.
+  EXPECT_EQ(1U,                  getPointerAuthStableSipHash16("_Zptrkvttf"));
+  EXPECT_EQ(0x314FD87E0611F020U, getPointerAuthStableSipHash64("_Zptrkvttf"));
+
+  EXPECT_EQ(0xFFFFU,             getPointerAuthStableSipHash16("_Zaflhllod"));
+  EXPECT_EQ(0x1292F635FB3DFBF8U, getPointerAuthStableSipHash64("_Zaflhllod"));
+}
+
+} // end anonymous namespace



More information about the cfe-commits mailing list