[libc-commits] [libc] 5794854 - [libc][NFC] Use 16-byte indices for _mmXXX_shuffle_epi8 (#77781)

via libc-commits libc-commits at lists.llvm.org
Thu Jan 11 07:26:00 PST 2024


Author: Guillaume Chatelet
Date: 2024-01-11T16:25:55+01:00
New Revision: 5794854213375017f52914afbae09a12b9a33e06

URL: https://github.com/llvm/llvm-project/commit/5794854213375017f52914afbae09a12b9a33e06
DIFF: https://github.com/llvm/llvm-project/commit/5794854213375017f52914afbae09a12b9a33e06.diff

LOG: [libc][NFC] Use 16-byte indices for _mmXXX_shuffle_epi8 (#77781)

This is less confusing, since within each 16-byte lane the shuffle only
uses the 4 lower bits of each index byte.
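
As an illustration of why this change is NFC, here is a minimal scalar
sketch of a per-lane byte shuffle. It is not code from the patch and does
not use the real intrinsics. The names `Bytes64`, `shuffle_epi8_model`,
`old_indices`, `new_indices` and `sample_input` are hypothetical, and the
model assumes the documented behaviour of the instruction: the low 4 bits
of each index byte select a byte within the same 16-byte lane, and a set
top bit zeroes the output byte. Arrays are in memory order (byte 0 first),
which is the reverse of the `_mm512_set_epi8` argument order.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes64 = std::array<std::uint8_t, 64>;

// Scalar model of a 512-bit byte shuffle (assumed semantics, see above):
// each output byte is picked from the same 16-byte lane of `src`, using
// only the low 4 bits of the index byte; bit 7 set yields zero.
inline Bytes64 shuffle_epi8_model(const Bytes64 &src, const Bytes64 &indices) {
  Bytes64 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const std::uint8_t idx = indices[i];
    const std::size_t lane_base = i & ~std::size_t{15};
    out[i] = (idx & 0x80) ? std::uint8_t{0} : src[lane_base + (idx & 0x0F)];
  }
  return out;
}

// Indices before the patch: absolute positions 0..63, with the byte order
// reversed inside every 8-byte group (byte i reads from byte i ^ 7).
inline Bytes64 old_indices() {
  Bytes64 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint8_t>(i ^ 7);
  return out;
}

// Indices after the patch: the same pattern reduced to 0..15, i.e.
// expressed relative to the start of each 16-byte lane.
inline Bytes64 new_indices() {
  Bytes64 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint8_t>((i ^ 7) & 0x0F);
  return out;
}

// Input whose byte values equal their positions, so results are easy to read.
inline Bytes64 sample_input() {
  Bytes64 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint8_t>(i);
  return out;
}
```

Under this model both index vectors produce the same output for any input,
because the bits that differ between them (bits 4 and 5) are ignored by the
shuffle.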

Added: 
    

Modified: 
    libc/src/string/memory_utils/op_x86.h

Removed: 
    


################################################################################
diff  --git a/libc/src/string/memory_utils/op_x86.h b/libc/src/string/memory_utils/op_x86.h
index a6529a6d424a30..6ae9583627bd6d 100644
--- a/libc/src/string/memory_utils/op_x86.h
+++ b/libc/src/string/memory_utils/op_x86.h
@@ -263,13 +263,13 @@ LIBC_INLINE uint64_t big_endian_cmp_mask(__m512i max, __m512i value) {
   // 16-byte lane.
   // zmm = | 16 bytes  | 16 bytes  | 16 bytes  | 16 bytes  |
   // zmm = | <8> | <8> | <8> | <8> | <8> | <8> | <8> | <8> |
-  const __m512i indices = _mm512_set_epi8(56, 57, 58, 59, 60, 61, 62, 63, //
-                                          48, 49, 50, 51, 52, 53, 54, 55, //
-                                          40, 41, 42, 43, 44, 45, 46, 47, //
-                                          32, 33, 34, 35, 36, 37, 38, 39, //
-                                          24, 25, 26, 27, 28, 29, 30, 31, //
-                                          16, 17, 18, 19, 20, 21, 22, 23, //
-                                          8, 9, 10, 11, 12, 13, 14, 15,   //
+  const __m512i indices = _mm512_set_epi8(8, 9, 10, 11, 12, 13, 14, 15, //
+                                          0, 1, 2, 3, 4, 5, 6, 7,       //
+                                          8, 9, 10, 11, 12, 13, 14, 15, //
+                                          0, 1, 2, 3, 4, 5, 6, 7,       //
+                                          8, 9, 10, 11, 12, 13, 14, 15, //
+                                          0, 1, 2, 3, 4, 5, 6, 7,       //
+                                          8, 9, 10, 11, 12, 13, 14, 15, //
                                           0, 1, 2, 3, 4, 5, 6, 7);
   // Then we compute the mask for equal bytes. In this mask the bits of each
   // byte are already reversed but the byte themselves should be reversed, this


        

