[clang] [X86][AVX10.2] Add comments for the avx10_2convertintrin.h file (PR #120766)
Mikołaj Piróg via cfe-commits
cfe-commits at lists.llvm.org
Sat Dec 21 09:21:30 PST 2024
================
@@ -24,567 +24,3243 @@
__attribute__((__always_inline__, __nodebug__, __target__("avx10.2-256"), \
__min_vector_width__(256)))
+/// Convert two 128-bit vectors, \a __A and \a __B, containing packed
+/// single-precision (32-bit) floating-point elements to a 128-bit vector
+/// containing FP16 elements.
+///
+/// \code{.operation}
+/// FOR i := 0 to 7
+///   IF i < 4
+///     dst.fp16[i] := convert_fp32_to_fp16(__B.fp32[i])
+///   ELSE
+///     dst.fp16[i] := convert_fp32_to_fp16(__A.fp32[i - 4])
+///   FI
+/// ENDFOR
+/// \endcode
----------------
mikolaj-pirog wrote:
Recent intrinsics (amxfp8intrin.h) also follow this order, as do the vast majority of existing intrinsics. It shouldn't be a problem for the tooling; if it is, I will fix the tooling.
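For readers checking the lane order in the `\code{.operation}` block above, here is a rough scalar sketch in C. It is only an illustration, not the header's implementation: the names `cvtx2ps_ph_ref` and `fp32_to_fp16_bits` are hypothetical, FP16 values are modeled as raw `uint16_t` bit patterns, and the conversion assumes round-to-nearest-even, which may not match the hardware in every rounding mode.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert one float to IEEE binary16 bits,
 * assuming round-to-nearest-even. */
static uint16_t fp32_to_fp16_bits(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp = (x >> 23) & 0xFFu;
    uint32_t mant = x & 0x7FFFFFu;

    if (exp == 0xFFu) /* Inf or NaN (NaN keeps a quiet bit set) */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));

    int32_t e = (int32_t)exp - 127 + 15; /* rebias the exponent */
    if (e >= 31) /* too large for FP16: overflow to infinity */
        return (uint16_t)(sign | 0x7C00u);

    if (e <= 0) { /* result is an FP16 subnormal or zero */
        if (e < -10)
            return (uint16_t)sign;
        mant |= 0x800000u; /* make the implicit leading 1 explicit */
        uint32_t shift = (uint32_t)(14 - e);
        uint32_t half = mant >> shift;
        uint32_t rem = mant & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half & 1u)))
            half++;
        return (uint16_t)(sign | half);
    }

    uint32_t half = ((uint32_t)e << 10) | (mant >> 13);
    uint32_t rem = mant & 0x1FFFu;
    /* A carry out of the mantissa correctly bumps the exponent. */
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;
    return (uint16_t)(sign | half);
}

/* Hypothetical scalar model of the documented operation:
 * lanes 0..3 come from b, lanes 4..7 come from a. */
void cvtx2ps_ph_ref(const float a[4], const float b[4], uint16_t dst[8]) {
    for (int i = 0; i < 8; i++) {
        if (i < 4)
            dst[i] = fp32_to_fp16_bits(b[i]);
        else
            dst[i] = fp32_to_fp16_bits(a[i - 4]);
    }
}
```

The point is the ordering: the second operand fills the low four lanes and the first operand fills the high four, matching the pseudocode in the comment.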
https://github.com/llvm/llvm-project/pull/120766