[libcxx-commits] [PATCH] D144362: [libcxx] Updated <experimental/simd> based on Parallelism-TS N4808

Mon Mar 13 00:34:25 PDT 2023

Joy12138 added inline comments.

================
Comment at: libcxx/include/experimental/simd:85-93
+  // class template simd [simd.class]
+  template<class _Tp, class _Abi = simd_abi::compatible<_Tp>> class simd;
+  template<class _Tp> using native_simd = simd<_Tp, simd_abi::native<_Tp>>;
+  template<class _Tp, int _Np> using fixed_size_simd = simd<_Tp, simd_abi::fixed_size<_Np>>;
+
+  // class template simd_mask [simd.mask.class]
+  template<class _Tp, class _Abi = simd_abi::compatible<_Tp>> class simd_mask;
----------------
philnik wrote:
> These should probably get their own headers. I guess the classes are not super small.
In our design of the SIMD library, the `simd` and `simd_mask` specializations will not get their own headers. The implementations are split by internal ABI tags rather than by the user-facing ABI tags, and the internal tags do not correspond one-to-one with the user-facing ones.

Let me briefly introduce the current structure design:

We divide the structure into three layers; the actual implementations live in the second and third layers. Each layer specializes its templates on its own set of ABI tags.

The first layer is the external user interface layer, which corresponds to the external ABI tags given in the specification, such as `scalar`, `native`, `compatible`, and `fixed_size`.

The second layer is the common internal implementation layer, which is platform-independent. This layer corresponds to the internal ABI tags `__scalar` and `__vec_ext`. The `__vec_ext` implementation uses Clang vector extensions (the GCC-style vector types) as the storage type, and Clang vector operators and built-ins to implement the vector operations. We have tested this on various platforms and confirmed that `__vec_ext` correctly generates the SIMD instructions supported by the current platform. We believe it will be a common solution for all platforms and will also serve as a fallback when the platform does not support SIMD.
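
As a rough illustration of the vector-extension approach described above (this is not the code in the patch), the sketch below declares a 4-element float vector with the GCC-style `vector_size` attribute and operates on it with ordinary operators. The names `float4`, `mul_add4`, and `sum4` are hypothetical, and the fixed 16-byte width is an assumption made only for the example.

```cpp
#include <cstddef>

// Assumption: the compiler supports GCC-style vector extensions (Clang and GCC do).
// `float4`, `mul_add4`, and `sum4` are hypothetical names used only in this sketch.
typedef float float4 __attribute__((vector_size(16)));  // 16 bytes = 4 floats

// Element-wise a * b + c written with ordinary operators; the compiler picks
// whatever SIMD instructions the target provides, or scalar code if it has none.
inline float4 mul_add4(float4 a, float4 b, float4 c) {
  return a * b + c;
}

// Horizontal sum using element subscripting on the vector type.
inline float sum4(float4 v) {
  float total = 0.0f;
  for (std::size_t i = 0; i < 4; ++i)
    total += v[i];
  return total;
}
```

Because no target-specific intrinsic appears in the source, the same code can compile for different targets, which is the property the common layer relies on.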

The third layer is the platform-dependent optimization layer. Because the common layer already generates correct SIMD instructions, this layer only needs to use platform-specific features or instructions to optimize the parts that perform poorly in the common layer. This work has not started yet; when it does, we may introduce platform-specific internal ABI tags such as `__AVX` and `__NEON`.
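
To make the layering more concrete, here is a minimal sketch of how user-facing tags could map onto a smaller set of internal tags, with the implementation specialized only on the internal ones. Everything in the `sketch` namespace is hypothetical: `scalar_tag` and `vec_ext_tag` stand in for `__scalar` and `__vec_ext`, the 16-byte register width is an assumption, and the plain array is a placeholder for the real vector-extension storage.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {

// Layer 2: internal tags (hypothetical stand-ins for `__scalar` and `__vec_ext`).
struct scalar_tag {};
template <int N> struct vec_ext_tag {};

// Layer 1: user-facing tags, expressed here as aliases of internal tags.
// Several external tags can land on the same internal tag, so the mapping
// is not one-to-one.
namespace simd_abi {
using scalar = scalar_tag;
template <int N> using fixed_size = vec_ext_tag<N>;
// Assumption for this sketch only: 16-byte vector registers.
template <class T> using native = vec_ext_tag<static_cast<int>(16 / sizeof(T))>;
template <class T> using compatible = vec_ext_tag<static_cast<int>(16 / sizeof(T))>;
}  // namespace simd_abi

// Implementation trait, specialized on internal tags only.
template <class T, class Abi> struct simd_storage;

template <class T> struct simd_storage<T, scalar_tag> {
  T data;
  static constexpr std::size_t size = 1;
};

template <class T, int N> struct simd_storage<T, vec_ext_tag<N>> {
  T data[N];  // placeholder; a real implementation would use a vector-extension type
  static constexpr std::size_t size = N;
};

// Layer 3 would add further specializations for platform-specific internal tags.

}  // namespace sketch
```

In this sketch `native<float>` and `fixed_size<4>` resolve to the same internal tag and therefore share one implementation, which is why splitting headers by user-facing tag would not match how the code is organized.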

Therefore,

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144362/new/

https://reviews.llvm.org/D144362