[libc-commits] [PATCH] D137868: [libc] Fix builtin definition for memory functions

Michael Jones via Phabricator via libc-commits libc-commits at lists.llvm.org
Tue Nov 15 16:45:01 PST 2022


michaelrj added inline comments.


================
Comment at: libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake:9
 if(${LIBC_TARGET_ARCHITECTURE_IS_X86})
-  set(ALL_CPU_FEATURES SSE2 SSE4_2 AVX2 AVX512F FMA)
+  set(ALL_CPU_FEATURES SSE2 SSE4_2 AVX2 AVX512F AVX512BW FMA)
   set(LIBC_COMPILE_OPTIONS_NATIVE -march=native)
----------------
gchatelet wrote:
> Thx! We'd also need to have it added to the CMakefile.txt for the bcmp and memcmp functions
>  - https://github.com/llvm/llvm-project/blob/a4ae029b087070c43d8eb25c9240de3eb345ed63/libc/src/string/CMakeLists.txt#L358
>  - https://github.com/llvm/llvm-project/blob/a4ae029b087070c43d8eb25c9240de3eb345ed63/libc/src/string/CMakeLists.txt#L412
> 
I switched the memory functions over to `AVX512BW` instead of `AVX512F` since it seems like only BW matters for this case.


================
Comment at: libc/src/string/memory_utils/op_x86.h:29
+// functions in case the builtin is not present.
+#ifndef _mm512_cmpneq_epi8_mask
 #define _mm512_cmpneq_epi8_mask(A, B) 0
----------------
gchatelet wrote:
> lntue wrote:
> > How will this work if `_mm512_cmpneq_epi8_mask` is defined as a function or compiler builtin instead of macro?  Will `defined(_mm512_cmpneq_epi8_mask)` also return false, and the macro will overwrite the real function calls?
> So technically the right way to do it is to rely on the `__has_builtin` intrinsic.
> It is available on clang forever but is quite recent on gcc, it requires gcc 10 (another bummer to have to support both gcc and clang)
> https://gcc.gnu.org/onlinedocs/gcc-10.1.0/cpp/_005f_005fhas_005fbuiltin.html
`__has_builtin` doesn't work because this isn't the name of the builtin. From `avx512bwintrin.h`:

```
#define _mm512_cmp_epi8_mask(a, b, p) \
  ((__mmask64)__builtin_ia32_cmpb512_mask((__v64qi)(__m512i)(a), \
                                          (__v64qi)(__m512i)(b), (int)(p), \
                                          (__mmask64)-1))
```

And `#ifndef` doesn't work either because sometimes these are defined as functions, from `avx2intrin.h` (on the same system):

```
static __inline__ int __DEFAULT_FN_ATTRS256
_mm256_movemask_epi8(__m256i __a)
{
  return __builtin_ia32_pmovmskb256((__v32qi)__a);
}
```

But it appears that the problems I was having trying to build on windows previously can be fixed by using the proper extension name (AVX512BW vs AVX512F), this builds clean on both and (afaict) uses the correct builtins.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137868/new/

https://reviews.llvm.org/D137868



More information about the libc-commits mailing list