[llvm-branch-commits] [llvm] [Support] Add SipHash-based 16/64-bit ptrauth stable hash. (PR #93902)
Ahmed Bougacha via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu May 30 17:21:30 PDT 2024
https://github.com/ahmedbougacha created https://github.com/llvm/llvm-project/pull/93902
Based on the SipHash reference implementation:
https://github.com/veorq/SipHash
which has very graciously been licensed under our llvm license (Apache-2.0 WITH LLVM-exception) by Jean-Philippe Aumasson.
This lightly modifies it to fit into libSupport, and wraps it for the two main interfaces we're interested in (16/64-bit).
This intentionally doesn't expose a raw interface beyond that to encourage others to carefully consider their use.
Beyond that, I'm ambivalent about naming this ptrauth, siphash, stablehash, or which combination of the three (which is what this patch does for the symbols, but keeping SipHash for the file, for other potential usage.)
The exact algorithm is the little-endian interpretation of the non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using a specific key value which can be found in the source.
By "stable" we mean that the result of this hash algorithm will the same across different compiler versions and target platforms.
The 16-bit variant is used extensively for the AArch64 ptrauth ABI, because AArch64 can efficiently load a 16-bit immediate into the high bits of a register without disturbing the remainder of the value, which serves as a nice blend operation.
16 bits is also sufficiently compact to not inflate a loader relocation. We disallow zero to guarantee a different discriminator from the places in the ABI that use a constant zero.
>From 42cb73fecf1033aefbe824149f3d8a5352bb2103 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha <ahmed at bougacha.org>
Date: Thu, 30 May 2024 17:05:31 -0700
Subject: [PATCH] [Support] Add SipHash-based 16/64-bit ptrauth stable hash.
Based on the SipHash reference implementation:
https://github.com/veorq/SipHash
which has very graciously been licensed under our llvm license
(Apache-2.0 WITH LLVM-exception) by Jean-Philippe Aumasson.
This lightly modifies it to fit into libSupport, and wraps it for the
two main interfaces we're interested in (16/64-bit).
This intentionally doesn't expose a raw interface beyond that to
encourage others to carefully consider their use.
The exact algorithm is the little-endian interpretation of the
non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using
a specific key value which can be found in the source.
By "stable" we mean that the result of this hash algorithm will the same
across different compiler versions and target platforms.
The 16-bit variant is used extensively for the AArch64 ptrauth ABI,
because AArch64 can efficiently load a 16-bit immediate into the high
bits of a register without disturbing the remainder of the value, which
serves as a nice blend operation.
16 bits is also sufficiently compact to not inflate a loader relocation.
We disallow zero to guarantee a different discriminator from the places
in the ABI that use a constant zero.
Co-Authored-By: John McCall <rjmccall at apple.com>
---
llvm/include/llvm/Support/SipHash.h | 47 +++++++
llvm/lib/Support/CMakeLists.txt | 1 +
llvm/lib/Support/SipHash.cpp | 174 +++++++++++++++++++++++++
llvm/unittests/Support/CMakeLists.txt | 1 +
llvm/unittests/Support/SipHashTest.cpp | 43 ++++++
5 files changed, 266 insertions(+)
create mode 100644 llvm/include/llvm/Support/SipHash.h
create mode 100644 llvm/lib/Support/SipHash.cpp
create mode 100644 llvm/unittests/Support/SipHashTest.cpp
diff --git a/llvm/include/llvm/Support/SipHash.h b/llvm/include/llvm/Support/SipHash.h
new file mode 100644
index 0000000000000..fcc29c00da185
--- /dev/null
+++ b/llvm/include/llvm/Support/SipHash.h
@@ -0,0 +1,47 @@
+//===--- SipHash.h - An ABI-stable string SipHash ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// A family of ABI-stable string hash algorithms based on SipHash, currently
+// used to compute ptrauth discriminators.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_SIPHASH_H
+#define LLVM_SUPPORT_SIPHASH_H
+
+#include <cstdint>
+
+namespace llvm {
+class StringRef;
+
+/// Compute a stable 64-bit hash of the given string.
+///
+/// The exact algorithm is the little-endian interpretation of the
+/// non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using
+/// a specific key value which can be found in the source.
+///
+/// By "stable" we mean that the result of this hash algorithm will
+/// the same across different compiler versions and target platforms.
+uint64_t getPointerAuthStableSipHash64(StringRef S);
+
+/// Compute a stable non-zero 16-bit hash of the given string.
+///
+/// This computes the full getPointerAuthStableSipHash64, but additionally
+/// truncates it down to a non-zero 16-bit value.
+///
+/// We use a 16-bit discriminator because ARM64 can efficiently load
+/// a 16-bit immediate into the high bits of a register without disturbing
+/// the remainder of the value, which serves as a nice blend operation.
+/// 16 bits is also sufficiently compact to not inflate a loader relocation.
+/// We disallow zero to guarantee a different discriminator from the places
+/// in the ABI that use a constant zero.
+uint64_t getPointerAuthStableSipHash16(StringRef S);
+
+} // end namespace llvm
+
+#endif
diff --git a/llvm/lib/Support/CMakeLists.txt b/llvm/lib/Support/CMakeLists.txt
index be4badc09efa5..aa37b812791ff 100644
--- a/llvm/lib/Support/CMakeLists.txt
+++ b/llvm/lib/Support/CMakeLists.txt
@@ -222,6 +222,7 @@ add_llvm_component_library(LLVMSupport
SHA1.cpp
SHA256.cpp
Signposts.cpp
+ SipHash.cpp
SmallPtrSet.cpp
SmallVector.cpp
SourceMgr.cpp
diff --git a/llvm/lib/Support/SipHash.cpp b/llvm/lib/Support/SipHash.cpp
new file mode 100644
index 0000000000000..dbd60fb73ebb5
--- /dev/null
+++ b/llvm/lib/Support/SipHash.cpp
@@ -0,0 +1,174 @@
+//===--- StableHash.cpp - An ABI-stable string hash -----------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements an ABI-stable string hash based on SipHash, used to
+// compute ptrauth discriminators.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/SipHash.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Debug.h"
+#include <cstdint>
+#include <cstring>
+
+using namespace llvm;
+
+#define DEBUG_TYPE "llvm-siphash"
+
+// Lightly adapted from the SipHash reference C implementation by
+// Jean-Philippe Aumasson and Daniel J. Bernstein.
+
+#define SIPHASH_ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
+
+#define SIPHASH_U8TO64_LE(p) \
+ (((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) | \
+ ((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) | \
+ ((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) | \
+ ((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))
+
+#define SIPHASH_SIPROUND \
+ do { \
+ v0 += v1; \
+ v1 = SIPHASH_ROTL(v1, 13); \
+ v1 ^= v0; \
+ v0 = SIPHASH_ROTL(v0, 32); \
+ v2 += v3; \
+ v3 = SIPHASH_ROTL(v3, 16); \
+ v3 ^= v2; \
+ v0 += v3; \
+ v3 = SIPHASH_ROTL(v3, 21); \
+ v3 ^= v0; \
+ v2 += v1; \
+ v1 = SIPHASH_ROTL(v1, 17); \
+ v1 ^= v2; \
+ v2 = SIPHASH_ROTL(v2, 32); \
+ } while (0)
+
+template <int cROUNDS, int dROUNDS, class ResultTy>
+static inline ResultTy siphash(const uint8_t *in, uint64_t inlen,
+ const uint8_t (&k)[16]) {
+ static_assert(sizeof(ResultTy) == 8 || sizeof(ResultTy) == 16,
+ "result type should be uint64_t or uint128_t");
+ uint64_t v0 = 0x736f6d6570736575ULL;
+ uint64_t v1 = 0x646f72616e646f6dULL;
+ uint64_t v2 = 0x6c7967656e657261ULL;
+ uint64_t v3 = 0x7465646279746573ULL;
+ uint64_t b;
+ uint64_t k0 = SIPHASH_U8TO64_LE(k);
+ uint64_t k1 = SIPHASH_U8TO64_LE(k + 8);
+ uint64_t m;
+ int i;
+ const uint8_t *end = in + inlen - (inlen % sizeof(uint64_t));
+ const int left = inlen & 7;
+ b = ((uint64_t)inlen) << 56;
+ v3 ^= k1;
+ v2 ^= k0;
+ v1 ^= k1;
+ v0 ^= k0;
+
+ if (sizeof(ResultTy) == 16) {
+ v1 ^= 0xee;
+ }
+
+ for (; in != end; in += 8) {
+ m = SIPHASH_U8TO64_LE(in);
+ v3 ^= m;
+
+ for (i = 0; i < cROUNDS; ++i)
+ SIPHASH_SIPROUND;
+
+ v0 ^= m;
+ }
+
+ switch (left) {
+ case 7:
+ b |= ((uint64_t)in[6]) << 48;
+ LLVM_FALLTHROUGH;
+ case 6:
+ b |= ((uint64_t)in[5]) << 40;
+ LLVM_FALLTHROUGH;
+ case 5:
+ b |= ((uint64_t)in[4]) << 32;
+ LLVM_FALLTHROUGH;
+ case 4:
+ b |= ((uint64_t)in[3]) << 24;
+ LLVM_FALLTHROUGH;
+ case 3:
+ b |= ((uint64_t)in[2]) << 16;
+ LLVM_FALLTHROUGH;
+ case 2:
+ b |= ((uint64_t)in[1]) << 8;
+ LLVM_FALLTHROUGH;
+ case 1:
+ b |= ((uint64_t)in[0]);
+ break;
+ case 0:
+ break;
+ }
+
+ v3 ^= b;
+
+ for (i = 0; i < cROUNDS; ++i)
+ SIPHASH_SIPROUND;
+
+ v0 ^= b;
+
+ if (sizeof(ResultTy) == 8) {
+ v2 ^= 0xff;
+ } else {
+ v2 ^= 0xee;
+ }
+
+ for (i = 0; i < dROUNDS; ++i)
+ SIPHASH_SIPROUND;
+
+ b = v0 ^ v1 ^ v2 ^ v3;
+
+ // This mess with the result type would be easier with 'if constexpr'.
+
+ uint64_t firstHalf = b;
+ if (sizeof(ResultTy) == 8)
+ return firstHalf;
+
+ v1 ^= 0xdd;
+
+ for (i = 0; i < dROUNDS; ++i)
+ SIPHASH_SIPROUND;
+
+ b = v0 ^ v1 ^ v2 ^ v3;
+ uint64_t secondHalf = b;
+
+ return firstHalf | (ResultTy(secondHalf) << (sizeof(ResultTy) == 8 ? 0 : 64));
+}
+
+//===--- LLVM-specific wrappers around siphash.
+
+/// Compute an ABI-stable 64-bit hash of the given string.
+uint64_t llvm::getPointerAuthStableSipHash64(StringRef Str) {
+ static const uint8_t K[16] = {0xb5, 0xd4, 0xc9, 0xeb, 0x79, 0x10, 0x4a, 0x79,
+ 0x6f, 0xec, 0x8b, 0x1b, 0x42, 0x87, 0x81, 0xd4};
+
+ // The aliasing is fine here because of omnipotent char.
+ auto *Data = reinterpret_cast<const uint8_t *>(Str.data());
+ return siphash<2, 4, uint64_t>(Data, Str.size(), K);
+}
+
+/// Compute an ABI-stable 16-bit hash of the given string.
+uint64_t llvm::getPointerAuthStableSipHash16(StringRef Str) {
+ uint64_t RawHash = getPointerAuthStableSipHash64(Str);
+
+ // Produce a non-zero 16-bit discriminator.
+ uint64_t Discriminator = (RawHash % 0xFFFF) + 1;
+ LLVM_DEBUG(dbgs() << "ptrauth stable hash 16-bit discriminator: "
+ << utostr(Discriminator) << " (0x"
+ << utohexstr(Discriminator) << ")"
+ << " of: " << Str << "\n");
+ return Discriminator;
+}
diff --git a/llvm/unittests/Support/CMakeLists.txt b/llvm/unittests/Support/CMakeLists.txt
index 2718be8450f80..631f2e6bf00df 100644
--- a/llvm/unittests/Support/CMakeLists.txt
+++ b/llvm/unittests/Support/CMakeLists.txt
@@ -75,6 +75,7 @@ add_llvm_unittest(SupportTests
ScopedPrinterTest.cpp
SHA256.cpp
SignalsTest.cpp
+ SipHashTest.cpp
SourceMgrTest.cpp
SpecialCaseListTest.cpp
SuffixTreeTest.cpp
diff --git a/llvm/unittests/Support/SipHashTest.cpp b/llvm/unittests/Support/SipHashTest.cpp
new file mode 100644
index 0000000000000..1a8143d9c9375
--- /dev/null
+++ b/llvm/unittests/Support/SipHashTest.cpp
@@ -0,0 +1,43 @@
+//===- llvm/unittest/Support/SipHashTest.cpp ------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/SipHash.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+namespace {
+
+TEST(SipHashTest, PointerAuthSipHash) {
+ // Test some basic cases, for 16 bit and 64 bit results.
+ EXPECT_EQ(0xE793U, getPointerAuthStableSipHash16(""));
+ EXPECT_EQ(0xF468U, getPointerAuthStableSipHash16("strlen"));
+ EXPECT_EQ(0x2D15U, getPointerAuthStableSipHash16("_ZN1 ind; f"));
+
+ EXPECT_EQ(0xB2BB69BB0A2AC0F1U, getPointerAuthStableSipHash64(""));
+ EXPECT_EQ(0x9304ABFF427B72E8U, getPointerAuthStableSipHash64("strlen"));
+ EXPECT_EQ(0x55F45179A08AE51BU, getPointerAuthStableSipHash64("_ZN1 ind; f"));
+
+ // Test some known strings that are already enshrined in the ABI.
+ EXPECT_EQ(0x6AE1U, getPointerAuthStableSipHash16("isa"));
+ EXPECT_EQ(0xB5ABU, getPointerAuthStableSipHash16("objc_class:superclass"));
+ EXPECT_EQ(0xC0BBU, getPointerAuthStableSipHash16("block_descriptor"));
+ EXPECT_EQ(0xC310U, getPointerAuthStableSipHash16("method_list_t"));
+
+ // Test the limits that apply to 16 bit results but don't to 64 bit results.
+ EXPECT_EQ(1U, getPointerAuthStableSipHash16("_Zptrkvttf"));
+ EXPECT_EQ(0x314FD87E0611F020U, getPointerAuthStableSipHash64("_Zptrkvttf"));
+
+ EXPECT_EQ(0xFFFFU, getPointerAuthStableSipHash16("_Zaflhllod"));
+ EXPECT_EQ(0x1292F635FB3DFBF8U, getPointerAuthStableSipHash64("_Zaflhllod"));
+}
+
+} // end anonymous namespace
More information about the llvm-branch-commits
mailing list