[PATCH] D47174: [X86] Move 128-bit f16c intrinsics to __emmintrin_f16c.h include from emmintrin.h. Move 256-bit f16c intrinsics back to f16cintrin.h

Tue May 22 13:51:23 PDT 2018

DavidKreitzer added a comment.

A bit of history: in icc, the f16<=>f32 conversion intrinsics are something of an anomaly, in that they can be implemented using either native code or emulation code, depending on the target architecture switch. See https://godbolt.org/g/bQy7xY (thanks, Craig, for the example code). The emulation code lives in the Intel Math Library.

The reason icc chose to declare the scalar & 128-bit versions of the intrinsics in emmintrin.h rather than a header file that more closely corresponds to the f16c feature is that emmintrin.h contains the minimum necessary to use the emulation code, i.e. the declaration of the __m128i type.

Given that clang doesn't support the lowering of these intrinsics to emulation code, I don't see much benefit in including them in emmintrin.h. It would make more sense to just put everything in f16cintrin.h and include that from immintrin.h.

In brief, I like your changes in immintrin.h. I would move the code from __emmintrin_f16c.h into f16cintrin.h, and I would remove the include from emmintrin.h. I think that would be consistent with gcc as well. We can let the emulation behavior of these intrinsics remain an icc-specific anomaly.

Repository:
  rL LLVM

https://reviews.llvm.org/D47174