[PATCH] D13776: [x86] Fix wrong prototypes for AVX mask load/store intrinsics.
Andrea Di Biagio via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 15 09:02:45 PDT 2015
andreadb created this revision.
andreadb added reviewers: qcolombet, rnk, nadav, bruno, mkuper.
andreadb added a subscriber: llvm-commits.
The llvm types used for the mask operand of AVX maskload/maskstore intrinsics are incorrect.
In particular:
1) The mask argument for __builtin_ia32_maskloadpd and __builtin_ia32_maskstorepd should be of type llvm_v2i64_ty and not llvm_v2f64_ty.
2) The mask argument for __builtin_ia32_maskloadpd256 and __builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not llvm_v4f64_ty.
3) The mask argument for __builtin_ia32_maskloadps and __builtin_ia32_maskstoreps should be of type llvm_v4i32_ty and not llvm_v4f32_ty.
4) The mask argument for __builtin_ia32_maskloadps256 and __builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not llvm_v8f32_ty.
Basically, the mask type for maskload/maskstore GCC builtins is never a vector of packed floats/doubles.
I also noticed that Clang definitions for those builtins are incorrect in BuiltinsX86.def. Also, Clang header file avxintrin.h definitions for maskload/maskstore intrinsics wrongly use packed floats/doubles instead of packed int/long for the mask operands.
For example, _mm_maskstore_pd is currently defined in avxintrin.h as:
///
static __inline __m256 __DEFAULT_FN_ATTRS
_mm_maskstore_pd(double *__p, __m128d __m, __m128d __a)
{
__builtin_ia32_maskstorepd((__v2df *)__p, (__v2df)__m, (__v2df)__a);
}
///
According to the Intel documentation, the correct prototype for _mm_maskstore_pd should be:
_mm_maskstore_pd(double *__p, __m128i __m, __m128d __a).
So, I think the definition should be something like:
///
static __inline __m256 __DEFAULT_FN_ATTRS
_mm_maskstore_pd(double *__p, __m128i __m, __m128d __a)
{
__builtin_ia32_maskstorepd((__v2df *)__p, (__v2di)__m, (__v2df)__a);
}
///
If you agree with this patch, I plan to send a follow-on patch (this time a Clang patch) to also fix intrinsic header file avxintrin.h (and the prototype definitions for the x86 maskload/store builtins in BuiltinsX86.def).
Please let me know if okay to submit.
-Andrea
http://reviews.llvm.org/D13776
Files:
include/llvm/IR/IntrinsicsX86.td
test/CodeGen/X86/avx-intrinsics-x86.ll
test/CodeGen/X86/avx-load-store.ll
test/CodeGen/X86/avx-win64.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13776.37488.patch
Type: text/x-patch
Size: 10826 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151015/799cdda6/attachment.bin>
More information about the llvm-commits
mailing list