[clang] [Headers][X86] Use `__builtin_elementwise_ctlz` instead of avx512cd intrinsics. (PR #155089)
Simon Pilgrim via cfe-commits
cfe-commits at lists.llvm.org
Wed Aug 27 10:57:38 PDT 2025
================
@@ -63,10 +69,9 @@ _mm512_maskz_conflict_epi32 (__mmask16 __U, __m512i __A)
(__v16si)_mm512_setzero_si512());
}
-static __inline__ __m512i __DEFAULT_FN_ATTRS
-_mm512_lzcnt_epi32 (__m512i __A)
-{
- return (__m512i) __builtin_ia32_vplzcntd_512 ((__v16si) __A);
+static __inline__ __m512i __DEFAULT_FN_ATTRS_CONSTEXPR
+_mm512_lzcnt_epi32(__m512i __A) {
+ return (__m512i)__builtin_elementwise_ctlz((__v16si)__A);
----------------
RKSimon wrote:
An alternative is we keep the __builtin_ia32_vplzcnt builtins and add them to VectorExprEvaluator::VisitCallExpr instead - its annoying not to use the generics, but the __builtin_elementwise_ctlz 2 operand variant will end up generating a ctlz+icmp+select sequence that won't be great in -O0 builds - but its whether we really care about that or not.
https://github.com/llvm/llvm-project/pull/155089
More information about the cfe-commits
mailing list