[PATCH] D20782: [AVX512] Emit generic masked store intrinsics directly from clang instead of using x86 specific intrinsics.
Craig Topper via cfe-commits
cfe-commits at lists.llvm.org
Sun May 29 23:33:18 PDT 2016
craig.topper added inline comments.
================
Comment at: lib/CodeGen/CGBuiltin.cpp:6304
@@ +6303,3 @@
+ Indices[i] = i;
+ Ops[2] = CGF.Builder.CreateShuffleVector(Ops[2], Ops[2],
+ makeArrayRef(Indices, NumElts),
----------------
delena wrote:
> craig.topper wrote:
> > delena wrote:
> > > What code do you get at the end? The architecture has no shuffle instruction for mask vectors.
> > That's not really a shuffle. It's an extract subvector, but the IR doesn't have a real instruction for that.
> >
> > It's needed so we can go from i8 -> v8i1 -> v2i1/v4i1.
> I understand. I just wanted to be sure that you receive only one "kmov %edi, %k1" at the end.
Yes, only one "kmov %edi, %k1" was generated.
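
For anyone following the thread, the i8 -> v8i1 -> v2i1/v4i1 step can be modeled in plain C++. This is only a rough sketch of the idea, not the code in the patch: maskToBits and extractLowElts are hypothetical names, and it assumes the little-endian bitcast layout where bit i of the i8 mask becomes element i of the v8i1. The identity index list plays the role of the Indices array passed to CreateShuffleVector.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Model of "bitcast i8 -> <8 x i1>" (assumed layout: bit i -> element i).
// Hypothetical helper, not part of clang or LLVM.
std::array<bool, 8> maskToBits(std::uint8_t mask) {
  std::array<bool, 8> bits{};
  for (unsigned i = 0; i != 8; ++i)
    bits[i] = ((mask >> i) & 1u) != 0;
  return bits;
}

// Model of a shufflevector whose indices are 0..numElts-1. Because the
// indices are the identity sequence, nothing is reordered: the result is
// just the low numElts elements, i.e. an extract-subvector.
// Hypothetical helper, not part of clang or LLVM.
std::vector<bool> extractLowElts(const std::array<bool, 8> &bits,
                                 unsigned numElts) {
  assert(numElts <= 8 && "cannot extract more elements than the mask has");
  std::vector<unsigned> indices(numElts);
  for (unsigned i = 0; i != numElts; ++i)
    indices[i] = i;

  std::vector<bool> result;
  result.reserve(numElts);
  for (unsigned idx : indices)
    result.push_back(bits[idx]);
  return result;
}
```

With this model, extractLowElts(maskToBits(m), 4) keeps the low four mask bits, which corresponds to the v4i1 case. Since no element is moved, it is plausible that the backend folds the whole sequence into a single mask move, which matches the single kmov observed above.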
http://reviews.llvm.org/D20782