[PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics
Kuperstein, Michael M via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 18 00:41:54 PDT 2015
I’m not sure how much people actually use these, but the AVX-512 versions, at least, can be very useful internally for implementing the AVX-512 intrinsics.
For AVX-512, we use the same GCC builtin for all 3 versions of the intrinsic (pass-through masked, set to zero masked, and unmasked). This is the same implementation that’s used in GCC, and is fairly clean, since the only difference is in the desired pass-through values (actual value, zero, or undef).
However, since we don’t actually have the undef intrinsics right now, we put a zero in the unmasked version as well, which is definitely a pessimization.
The plan is to change them to use undef once the undef intrinsics are implemented.
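To make the one-builtin, three-wrappers pattern above concrete, here is a minimal portable sketch in plain C. It does not use the real AVX-512 builtins or headers: `vec8`, `mask8`, `sketch_abs_mask` and the three `sketch_*` wrappers are hypothetical names invented for illustration, and a scalar loop stands in for a masked absolute-value operation. The only thing it is meant to show is that the wrappers differ solely in the pass-through operand.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

/* Hypothetical stand-ins for __m512i (8 x 64-bit lanes) and __mmask8. */
typedef struct { int64_t lane[LANES]; } vec8;
typedef uint8_t mask8;

/* Stand-in for the single masked builtin: lanes whose mask bit is set
   receive |a|, all other lanes are copied from passthru. */
vec8 sketch_abs_mask(vec8 a, vec8 passthru, mask8 k) {
    vec8 r;
    for (int i = 0; i < LANES; ++i) {
        if ((k >> i) & 1)
            r.lane[i] = a.lane[i] < 0 ? -a.lane[i] : a.lane[i];
        else
            r.lane[i] = passthru.lane[i];
    }
    return r;
}

vec8 sketch_zero(void) {
    vec8 z = {{0}};
    return z;
}

/* Pass-through masked: unselected lanes keep the caller's src value. */
vec8 sketch_mask_abs(vec8 src, mask8 k, vec8 a) {
    return sketch_abs_mask(a, src, k);
}

/* Set-to-zero masked: unselected lanes become zero. */
vec8 sketch_maskz_abs(mask8 k, vec8 a) {
    return sketch_abs_mask(a, sketch_zero(), k);
}

/* Unmasked: the mask is all ones, so the pass-through operand is never
   observed. Zero is passed here, mirroring the current workaround; an
   undef value would go in this slot once one is available. */
vec8 sketch_abs(vec8 a) {
    return sketch_abs_mask(a, sketch_zero(), (mask8)0xFF);
}

/* Small usage example exercising all three wrappers. */
void sketch_example(void) {
    vec8 a = {{-1, 2, -3, 4, -5, 6, -7, 8}};
    vec8 src = {{9, 9, 9, 9, 9, 9, 9, 9}};
    assert(sketch_mask_abs(src, 0x01, a).lane[0] == 1);
    assert(sketch_mask_abs(src, 0x01, a).lane[1] == 9);
    assert(sketch_maskz_abs(0x01, a).lane[1] == 0);
    assert(sketch_abs(a).lane[4] == 5);
}
```

In the unmasked wrapper the zero operand is dead as far as the result is concerned, which is why an undef value would be the natural thing to pass there instead.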
From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Monday, August 17, 2015 21:33
To: reviews+D12052+public+a6057f04f570e35c at reviews.llvm.org; llvm-dev at redking.me.uk; craig.topper at gmail.com; Kuperstein, Michael M
Cc: david.majnemer at gmail.com; Badouh, Asaf; cfe-commits at lists.llvm.org; Richard Smith
Subject: Re: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics
On Sun, Aug 16, 2015 at 3:05 AM Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
RKSimon added a comment.
Yes, using that uninitialized value has worried me as well. I originally set it to zero (and considered using __LINE__ or __COUNTER__), but both introduce defined behaviour that I could see causing all sorts of problems further down the line in debug vs release builds. How undefined do we want our undefined to be? ;-)
Yeah, this is why I hadn't implemented them yet either.
I can create __builtin_ia32_undef64mmx / __builtin_ia32_undef128 / __builtin_ia32_undef256 / __builtin_ia32_undef512 if nobody can think of a better alternative?
This seems fairly heavyweight, but I don't have any better ideas. I'll assume we don't want to try to expose undef as a value in C (making it something we could just add); if not, then this seems to make the most sense. It's pretty painful/ugly though.
Are people using these, or did someone just notice they were missing and want them for completeness? We probably _could_ define them to zero and leave it at that. It's not pleasant, and it's slower than it needs to be, but it's not crazy.
-eric
Repository:
rL LLVM
http://reviews.llvm.org/D12052