[PATCH] D13776: [x86] Fix wrong prototypes for AVX mask load/store intrinsics.

Thu Oct 15 09:02:45 PDT 2015

andreadb created this revision.
andreadb added reviewers: qcolombet, rnk, nadav, bruno, mkuper.
andreadb added a subscriber: llvm-commits.

The llvm types used for the mask operand of AVX maskload/maskstore intrinsics are incorrect.

In particular:
1) The mask argument for __builtin_ia32_maskloadpd and  __builtin_ia32_maskstorepd should be of type llvm_v2i64_ty and not llvm_v2f64_ty.
2) The mask argument for __builtin_ia32_maskloadpd256 and  __builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not llvm_v4f64_ty.
3) The mask argument for __builtin_ia32_maskloadps and  __builtin_ia32_maskstoreps should be of type llvm_v4i32_ty and not llvm_v4f32_ty.
4) The mask argument for __builtin_ia32_maskloadps256 and  __builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not llvm_v8f32_ty.

Basically, the mask type for maskload/maskstore GCC builtins is never a vector of packed floats/doubles.

I also noticed that Clang definitions for those builtins are incorrect in BuiltinsX86.def. Also, Clang header file avxintrin.h definitions for maskload/maskstore intrinsics wrongly use packed floats/doubles instead of packed int/long for the mask operands.

For example, _mm_maskstore_pd is currently defined in avxintrin.h as:

///
static __inline __m256 __DEFAULT_FN_ATTRS
_mm_maskstore_pd(double *__p, __m128d __m, __m128d __a)
{
  __builtin_ia32_maskstorepd((__v2df *)__p, (__v2df)__m, (__v2df)__a);
}
///

According to the Intel documentation, the correct prototype for _mm_maskstore_pd should be:
_mm_maskstore_pd(double *__p, __m128i __m, __m128d __a).

So, I think the definition should be something like:

///
static __inline __m256 __DEFAULT_FN_ATTRS
_mm_maskstore_pd(double *__p, __m128i __m, __m128d __a)
{
  __builtin_ia32_maskstorepd((__v2df *)__p, (__v2di)__m, (__v2df)__a);
}
///

If you agree with this patch, I plan to send a follow-on patch (this time a  Clang patch) to also fix intrinsic header file avxintrin.h (and the prototype definitions for the x86 maskload/store builtins in BuiltinsX86.def).

Please let me know if okay to submit.

-Andrea

http://reviews.llvm.org/D13776

Files:
  include/llvm/IR/IntrinsicsX86.td
  test/CodeGen/X86/avx-intrinsics-x86.ll
  test/CodeGen/X86/avx-load-store.ll
  test/CodeGen/X86/avx-win64.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13776.37488.patch
Type: text/x-patch
Size: 10826 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151015/799cdda6/attachment.bin>