[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)
via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Jun 4 15:11:58 PDT 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-llvm-support
Author: Ahmed Bougacha (ahmedbougacha)
<details>
<summary>Changes</summary>
Start building it as part of the library, with some minor
tweaks compared to the reference implementation:
- clang-format to match libSupport
- remove tracing support
- add file header
- templatize cROUNDS/dROUNDS, as well as 8B/16B result type
- replace assert with static_assert
- return the result directly, as uint64_t/uint128_t
- remove big-endian support
- use LLVM_FALLTHROUGH
- remove tracing support
The siphash function itself isn't used yet, and will be in a follow-up
commit.
---
Full diff: https://github.com/llvm/llvm-project/pull/94394.diff
3 Files Affected:
- (modified) llvm/lib/Support/CMakeLists.txt (+1-3)
- (removed) llvm/lib/Support/README.md.SipHash (-126)
- (modified) llvm/lib/Support/SipHash.cpp (+126-176)
``````````diff
diff --git a/llvm/lib/Support/CMakeLists.txt b/llvm/lib/Support/CMakeLists.txt
index 7cc01a5399911..f5f8447d48d01 100644
--- a/llvm/lib/Support/CMakeLists.txt
+++ b/llvm/lib/Support/CMakeLists.txt
@@ -127,9 +127,6 @@ endif()
add_subdirectory(BLAKE3)
-# Temporarily ignore SipHash.cpp before we fully integrate it into LLVMSupport.
-set(LLVM_OPTIONAL_SOURCES SipHash.cpp)
-
add_llvm_component_library(LLVMSupport
ABIBreak.cpp
AMDGPUMetadata.cpp
@@ -226,6 +223,7 @@ add_llvm_component_library(LLVMSupport
SHA1.cpp
SHA256.cpp
Signposts.cpp
+ SipHash.cpp
SmallPtrSet.cpp
SmallVector.cpp
SourceMgr.cpp
diff --git a/llvm/lib/Support/README.md.SipHash b/llvm/lib/Support/README.md.SipHash
deleted file mode 100644
index 4de3cd1854681..0000000000000
--- a/llvm/lib/Support/README.md.SipHash
+++ /dev/null
@@ -1,126 +0,0 @@
-# SipHash
-
-[![License:
-CC0-1.0](https://licensebuttons.net/l/zero/1.0/80x15.png)](http://creativecommons.org/publicdomain/zero/1.0/)
-
-[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-
-
-SipHash is a family of pseudorandom functions (PRFs) optimized for speed on short messages.
-This is the reference C code of SipHash: portable, simple, optimized for clarity and debugging.
-
-SipHash was designed in 2012 by [Jean-Philippe Aumasson](https://aumasson.jp)
-and [Daniel J. Bernstein](https://cr.yp.to) as a defense against [hash-flooding
-DoS attacks](https://aumasson.jp/siphash/siphashdos_29c3_slides.pdf).
-
-SipHash is:
-
-* *Simpler and faster* on short messages than previous cryptographic
-algorithms, such as MACs based on universal hashing.
-
-* *Competitive in performance* with insecure non-cryptographic algorithms, such as [fhhash](https://github.com/cbreeden/fxhash).
-
-* *Cryptographically secure*, with no sign of weakness despite multiple [cryptanalysis](https://eprint.iacr.org/2019/865) [projects](https://eprint.iacr.org/2019/865) by leading cryptographers.
-
-* *Battle-tested*, with successful integration in OSs (Linux kernel, OpenBSD,
-FreeBSD, FreeRTOS), languages (Perl, Python, Ruby, etc.), libraries (OpenSSL libcrypto,
-Sodium, etc.) and applications (Wireguard, Redis, etc.).
-
-As a secure pseudorandom function (a.k.a. keyed hash function), SipHash can also be used as a secure message authentication code (MAC).
-But SipHash is *not a hash* in the sense of general-purpose key-less hash function such as BLAKE3 or SHA-3.
-SipHash should therefore always be used with a secret key in order to be secure.
-
-
-## Variants
-
-The default SipHash is *SipHash-2-4*: it takes a 128-bit key, does 2 compression
-rounds, 4 finalization rounds, and returns a 64-bit tag.
-
-Variants can use a different number of rounds. For example, we proposed *SipHash-4-8* as a conservative version.
-
-The following versions are not described in the paper but were designed and analyzed to fulfill applications' needs:
-
-* *SipHash-128* returns a 128-bit tag instead of 64-bit. Versions with specified number of rounds are SipHash-2-4-128, SipHash4-8-128, and so on.
-
-* *HalfSipHash* works with 32-bit words instead of 64-bit, takes a 64-bit key,
-and returns 32-bit or 64-bit tags. For example, HalfSipHash-2-4-32 has 2
-compression rounds, 4 finalization rounds, and returns a 32-bit tag.
-
-
-## Security
-
-(Half)SipHash-*c*-*d* with *c* ≥ 2 and *d* ≥ 4 is expected to provide the maximum PRF
-security for any function with the same key and output size.
-
-The standard PRF security goal allow the attacker access to the output of SipHash on messages chosen adaptively by the attacker.
-
-Security is limited by the key size (128 bits for SipHash), such that
-attackers searching 2<sup>*s*</sup> keys have chance 2<sup>*s*−128</sup> of finding
-the SipHash key.
-Security is also limited by the output size. In particular, when
-SipHash is used as a MAC, an attacker who blindly tries 2<sup>*s*</sup> tags will
-succeed with probability 2<sup>*s*-*t*</sup>, if *t* is that tag's bit size.
-
-
-## Research
-
-* [Research paper](https://www.aumasson.jp/siphash/siphash.pdf) "SipHash: a fast short-input PRF" (accepted at INDOCRYPT 2012)
-* [Slides](https://cr.yp.to/talks/2012.12.12/slides.pdf) of the presentation of SipHash at INDOCRYPT 2012 (Bernstein)
-* [Slides](https://www.aumasson.jp/siphash/siphash_slides.pdf) of the presentation of SipHash at the DIAC workshop (Aumasson)
-
-
-## Usage
-
-Running
-
-```sh
- make
-```
-
-will build tests for
-
-* SipHash-2-4-64
-* SipHash-2-4-128
-* HalfSipHash-2-4-32
-* HalfSipHash-2-4-64
-
-
-```C
- ./test
-```
-
-verifies 64 test vectors, and
-
-```C
- ./debug
-```
-
-does the same and prints intermediate values.
-
-The code can be adapted to implement SipHash-*c*-*d*, the version of SipHash
-with *c* compression rounds and *d* finalization rounds, by defining `cROUNDS`
-or `dROUNDS` when compiling. This can be done with `-D` command line arguments
-to many compilers such as below.
-
-```sh
-gcc -Wall --std=c99 -DcROUNDS=2 -DdROUNDS=4 siphash.c halfsiphash.c test.c -o test
-```
-
-The `makefile` also takes *c* and *d* rounds values as parameters.
-
-```sh
-make cROUNDS=2 dROUNDS=4
-```
-
-Obviously, if the number of rounds is modified then the test vectors
-won't verify.
-
-## Intellectual property
-
-This code is copyright (c) 2014-2023 Jean-Philippe Aumasson, Daniel J.
-Bernstein. It is multi-licensed under
-
-* [CC0](./LICENCE_CC0)
-* [MIT](./LICENSE_MIT).
-* [Apache 2.0 with LLVM exceptions](./LICENSE_A2LLVM).
-
diff --git a/llvm/lib/Support/SipHash.cpp b/llvm/lib/Support/SipHash.cpp
index c6d16e205521d..ef882ae4d8745 100644
--- a/llvm/lib/Support/SipHash.cpp
+++ b/llvm/lib/Support/SipHash.cpp
@@ -1,185 +1,135 @@
-/*
- SipHash reference C implementation
-
- Copyright (c) 2012-2022 Jean-Philippe Aumasson
- <jeanphilippe.aumasson at gmail.com>
- Copyright (c) 2012-2014 Daniel J. Bernstein <djb at cr.yp.to>
-
- To the extent possible under law, the author(s) have dedicated all copyright
- and related and neighboring rights to this software to the public domain
- worldwide. This software is distributed without any warranty.
-
- You should have received a copy of the CC0 Public Domain Dedication along
- with
- this software. If not, see
- <http://creativecommons.org/publicdomain/zero/1.0/>.
- */
-
-#include "siphash.h"
-#include <assert.h>
-#include <stddef.h>
-#include <stdint.h>
-
-/* default: SipHash-2-4 */
-#ifndef cROUNDS
-#define cROUNDS 2
-#endif
-#ifndef dROUNDS
-#define dROUNDS 4
-#endif
+//===--- SipHash.cpp - An ABI-stable string hash --------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
-#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
+#include "llvm/Support/Compiler.h"
+#include <cstdint>
-#define U32TO8_LE(p, v) \
- (p)[0] = (uint8_t)((v)); \
- (p)[1] = (uint8_t)((v) >> 8); \
- (p)[2] = (uint8_t)((v) >> 16); \
- (p)[3] = (uint8_t)((v) >> 24);
+// Lightly adapted from the SipHash reference C implementation by
+// Jean-Philippe Aumasson and Daniel J. Bernstein.
-#define U64TO8_LE(p, v) \
- U32TO8_LE((p), (uint32_t)((v))); \
- U32TO8_LE((p) + 4, (uint32_t)((v) >> 32));
+#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
#define U8TO64_LE(p) \
- (((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) | \
- ((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) | \
- ((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) | \
- ((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))
+ (((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) | \
+ ((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) | \
+ ((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) | \
+ ((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))
#define SIPROUND \
- do { \
- v0 += v1; \
- v1 = ROTL(v1, 13); \
- v1 ^= v0; \
- v0 = ROTL(v0, 32); \
- v2 += v3; \
- v3 = ROTL(v3, 16); \
- v3 ^= v2; \
- v0 += v3; \
- v3 = ROTL(v3, 21); \
- v3 ^= v0; \
- v2 += v1; \
- v1 = ROTL(v1, 17); \
- v1 ^= v2; \
- v2 = ROTL(v2, 32); \
- } while (0)
-
-#ifdef DEBUG_SIPHASH
-#include <stdio.h>
-
-#define TRACE \
- do { \
- printf("(%3zu) v0 %016" PRIx64 "\n", inlen, v0); \
- printf("(%3zu) v1 %016" PRIx64 "\n", inlen, v1); \
- printf("(%3zu) v2 %016" PRIx64 "\n", inlen, v2); \
- printf("(%3zu) v3 %016" PRIx64 "\n", inlen, v3); \
- } while (0)
-#else
-#define TRACE
-#endif
-
-/*
- Computes a SipHash value
- *in: pointer to input data (read-only)
- inlen: input data length in bytes (any size_t value)
- *k: pointer to the key data (read-only), must be 16 bytes
- *out: pointer to output data (write-only), outlen bytes must be allocated
- outlen: length of the output in bytes, must be 8 or 16
-*/
-int siphash(const void *in, const size_t inlen, const void *k, uint8_t *out,
- const size_t outlen) {
-
- const unsigned char *ni = (const unsigned char *)in;
- const unsigned char *kk = (const unsigned char *)k;
-
- assert((outlen == 8) || (outlen == 16));
- uint64_t v0 = UINT64_C(0x736f6d6570736575);
- uint64_t v1 = UINT64_C(0x646f72616e646f6d);
- uint64_t v2 = UINT64_C(0x6c7967656e657261);
- uint64_t v3 = UINT64_C(0x7465646279746573);
- uint64_t k0 = U8TO64_LE(kk);
- uint64_t k1 = U8TO64_LE(kk + 8);
- uint64_t m;
- int i;
- const unsigned char *end = ni + inlen - (inlen % sizeof(uint64_t));
- const int left = inlen & 7;
- uint64_t b = ((uint64_t)inlen) << 56;
- v3 ^= k1;
- v2 ^= k0;
- v1 ^= k1;
- v0 ^= k0;
-
- if (outlen == 16)
- v1 ^= 0xee;
-
- for (; ni != end; ni += 8) {
- m = U8TO64_LE(ni);
- v3 ^= m;
-
- TRACE;
- for (i = 0; i < cROUNDS; ++i)
- SIPROUND;
-
- v0 ^= m;
- }
-
- switch (left) {
- case 7:
- b |= ((uint64_t)ni[6]) << 48;
- /* FALLTHRU */
- case 6:
- b |= ((uint64_t)ni[5]) << 40;
- /* FALLTHRU */
- case 5:
- b |= ((uint64_t)ni[4]) << 32;
- /* FALLTHRU */
- case 4:
- b |= ((uint64_t)ni[3]) << 24;
- /* FALLTHRU */
- case 3:
- b |= ((uint64_t)ni[2]) << 16;
- /* FALLTHRU */
- case 2:
- b |= ((uint64_t)ni[1]) << 8;
- /* FALLTHRU */
- case 1:
- b |= ((uint64_t)ni[0]);
- break;
- case 0:
- break;
- }
-
- v3 ^= b;
-
- TRACE;
- for (i = 0; i < cROUNDS; ++i)
- SIPROUND;
-
- v0 ^= b;
-
- if (outlen == 16)
- v2 ^= 0xee;
- else
- v2 ^= 0xff;
-
- TRACE;
- for (i = 0; i < dROUNDS; ++i)
- SIPROUND;
+ do { \
+ v0 += v1; \
+ v1 = ROTL(v1, 13); \
+ v1 ^= v0; \
+ v0 = ROTL(v0, 32); \
+ v2 += v3; \
+ v3 = ROTL(v3, 16); \
+ v3 ^= v2; \
+ v0 += v3; \
+ v3 = ROTL(v3, 21); \
+ v3 ^= v0; \
+ v2 += v1; \
+ v1 = ROTL(v1, 17); \
+ v1 ^= v2; \
+ v2 = ROTL(v2, 32); \
+ } while (0)
+
+template <int cROUNDS, int dROUNDS, class ResultTy>
+static inline ResultTy siphash(const unsigned char *in, uint64_t inlen,
+ const unsigned char (&k)[16]) {
+
+ const unsigned char *ni = (const unsigned char *)in;
+ const unsigned char *kk = (const unsigned char *)k;
+
+ static_assert(sizeof(ResultTy) == 8 || sizeof(ResultTy) == 16,
+ "result type should be uint64_t or uint128_t");
+ uint64_t v0 = UINT64_C(0x736f6d6570736575);
+ uint64_t v1 = UINT64_C(0x646f72616e646f6d);
+ uint64_t v2 = UINT64_C(0x6c7967656e657261);
+ uint64_t v3 = UINT64_C(0x7465646279746573);
+ uint64_t k0 = U8TO64_LE(kk);
+ uint64_t k1 = U8TO64_LE(kk + 8);
+ uint64_t m;
+ int i;
+ const unsigned char *end = ni + inlen - (inlen % sizeof(uint64_t));
+ const int left = inlen & 7;
+ uint64_t b = ((uint64_t)inlen) << 56;
+ v3 ^= k1;
+ v2 ^= k0;
+ v1 ^= k1;
+ v0 ^= k0;
+
+ if (sizeof(ResultTy) == 16)
+ v1 ^= 0xee;
+
+ for (; ni != end; ni += 8) {
+ m = U8TO64_LE(ni);
+ v3 ^= m;
- b = v0 ^ v1 ^ v2 ^ v3;
- U64TO8_LE(out, b);
-
- if (outlen == 8)
- return 0;
-
- v1 ^= 0xdd;
-
- TRACE;
- for (i = 0; i < dROUNDS; ++i)
- SIPROUND;
-
- b = v0 ^ v1 ^ v2 ^ v3;
- U64TO8_LE(out + 8, b);
-
- return 0;
+ for (i = 0; i < cROUNDS; ++i)
+ SIPROUND;
+
+ v0 ^= m;
+ }
+
+ switch (left) {
+ case 7:
+ b |= ((uint64_t)ni[6]) << 48;
+ LLVM_FALLTHROUGH;
+ case 6:
+ b |= ((uint64_t)ni[5]) << 40;
+ LLVM_FALLTHROUGH;
+ case 5:
+ b |= ((uint64_t)ni[4]) << 32;
+ LLVM_FALLTHROUGH;
+ case 4:
+ b |= ((uint64_t)ni[3]) << 24;
+ LLVM_FALLTHROUGH;
+ case 3:
+ b |= ((uint64_t)ni[2]) << 16;
+ LLVM_FALLTHROUGH;
+ case 2:
+ b |= ((uint64_t)ni[1]) << 8;
+ LLVM_FALLTHROUGH;
+ case 1:
+ b |= ((uint64_t)ni[0]);
+ break;
+ case 0:
+ break;
+ }
+
+ v3 ^= b;
+
+ for (i = 0; i < cROUNDS; ++i)
+ SIPROUND;
+
+ v0 ^= b;
+
+ if (sizeof(ResultTy) == 16)
+ v2 ^= 0xee;
+ else
+ v2 ^= 0xff;
+
+ for (i = 0; i < dROUNDS; ++i)
+ SIPROUND;
+
+ b = v0 ^ v1 ^ v2 ^ v3;
+
+ uint64_t firstHalf = b;
+ if (sizeof(ResultTy) == 8)
+ return firstHalf;
+
+ v1 ^= 0xdd;
+
+ for (i = 0; i < dROUNDS; ++i)
+ SIPROUND;
+
+ b = v0 ^ v1 ^ v2 ^ v3;
+ uint64_t secondHalf = b;
+
+ return firstHalf | (ResultTy(secondHalf) << (sizeof(ResultTy) == 8 ? 0 : 64));
}
``````````
</details>
https://github.com/llvm/llvm-project/pull/94394
More information about the llvm-branch-commits
mailing list