[clang] [llvm] Clang: convert `__m64` intrinsics to unconditionally use SSE2 instead of MMX. (PR #96540)

Tue Jun 25 12:52:43 PDT 2024

================
@@ -1558,10 +1559,10 @@ _mm_cvttss_si64(__m128 __a)
 /// \param __a
 ///    A 128-bit vector of [4 x float].
 /// \returns A 64-bit integer vector containing the converted values.
-static __inline__ __m64 __DEFAULT_FN_ATTRS_MMX
+static __inline__ __m64 __DEFAULT_FN_ATTRS_SSE2
 _mm_cvttps_pi32(__m128 __a)
 {
-  return (__m64)__builtin_ia32_cvttps2pi((__v4sf)__a);
+  return __trunc64(__builtin_ia32_cvttps2dq((__v4sf)__zeroupper64(__a)));
----------------
jyknight wrote:

I'm not sure. Is `__builtin_convertvector` from float to int guaranteed to have the semantics this intrinsic requires?

Even if it is feasible, I'd prefer to leave that change to future work that eliminates `__builtin_ia32_cvttps2dq` (and similar builtins), since the same should be done to `_mm_cvttps_epi32`, `_mm256_cvttps_epi32`, `_mm_cvtpd_epi32`, `_mm_cvtpd_pi32`, and `_mm256_cvtpd_epi32`, at least.

https://github.com/llvm/llvm-project/pull/96540