[clang] [X86][Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - Allow AVX/AVX512 IFMA madd52 intrinsics to be used in constexpr (PR #161056)
Simon Pilgrim via cfe-commits
cfe-commits at lists.llvm.org
Thu Oct 2 08:50:37 PDT 2025
================
@@ -52,3 +56,35 @@ __m256i test_mm256_madd52lo_avx_epu64(__m256i __X, __m256i __Y, __m256i __Z) {
// CHECK: call {{.*}}<4 x i64> @llvm.x86.avx512.vpmadd52l.uq.256(<4 x i64> %{{.*}}, <4 x i64> %{{.*}}, <4 x i64> %{{.*}})
return _mm256_madd52lo_avx_epu64(__X, __Y, __Z);
}
+
+TEST_CONSTEXPR(match_v2di(_mm_madd52lo_epu64((__m128i)((__v2du){0, 0}), (__m128i)((__v2du){10, 0}), (__m128i)((__v2du){5, 0})), 50, 0), "mm_madd52lo_epu64: basic multiply-add low bits");
+
+TEST_CONSTEXPR(match_v2di(_mm_madd52lo_epu64((__m128i)((__v2du){100, 0}), (__m128i)((__v2du){20, 0}), (__m128i)((__v2du){30, 0})), 700, 0), "mm_madd52lo_epu64: accumulator test");
+
+TEST_CONSTEXPR(match_v2di(_mm_madd52lo_epu64((__m128i)((__v2du){1, 2}), (__m128i)((__v2du){10, 20}), (__m128i)((__v2du){2, 3})), 21, 62), "mm_madd52lo_epu64: two-lane computation");
----------------
RKSimon wrote:
These TEST_CONSTEXPR checks should be placed directly below the test_mm_madd52lo_epu64 test, and likewise for the other intrinsics. The idea is that all tests for a specific intrinsic stay together in one coherent block.
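
For reference, the expected values in the quoted checks (50, 700, and 21/62) can be reproduced with a scalar model of one 64-bit lane. This is only a sketch, not the code from the PR: `madd52lo_lane` and `kMask52` are hypothetical names, and it assumes the documented behaviour that the low 52 bits of the product of the 52-bit second and third operands are added to the first operand.

```cpp
#include <cstdint>

// Hypothetical helper names; this models a single 64-bit lane only.
constexpr std::uint64_t kMask52 = (std::uint64_t{1} << 52) - 1;

constexpr std::uint64_t madd52lo_lane(std::uint64_t acc, std::uint64_t b,
                                      std::uint64_t c) {
  // Only the low 52 bits of each multiplicand take part in the product.
  // The low 52 bits of the 104-bit product survive 64-bit wraparound,
  // so an ordinary uint64_t multiply followed by a mask is sufficient.
  const std::uint64_t lo = ((b & kMask52) * (c & kMask52)) & kMask52;
  // The accumulator is a full 64-bit value; the add wraps modulo 2^64.
  return acc + lo;
}

// Same inputs as the TEST_CONSTEXPR lines in the hunk above.
static_assert(madd52lo_lane(0, 10, 5) == 50, "basic multiply-add");
static_assert(madd52lo_lane(100, 20, 30) == 700, "accumulator");
static_assert(madd52lo_lane(1, 10, 2) == 21, "two-lane, lane 0");
static_assert(madd52lo_lane(2, 20, 3) == 62, "two-lane, lane 1");
```

A model like this may also help when picking extra cases, for example operands with bits set above bit 51 or products that overflow 52 bits.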
https://github.com/llvm/llvm-project/pull/161056