r276417 - [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 with generic IR
Chandler Carruth via cfe-commits
cfe-commits at lists.llvm.org
Wed Aug 10 00:54:03 PDT 2016
On Fri, Jul 22, 2016 at 7:06 AM Simon Pilgrim via cfe-commits <
cfe-commits at lists.llvm.org> wrote:
> Author: rksimon
> Date: Fri Jul 22 08:58:56 2016
> New Revision: 276417
>
> URL: http://llvm.org/viewvc/llvm-project?rev=276417&view=rev
> Log:
> [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128
> with generic IR
>
> As discussed on D22460, I've updated the vbroadcastf128 pd256/ps256
> builtins to map directly to generic IR - load+splat a 128-bit vector to
> both lanes of a 256-bit vector.
>
> Fix for PR28657.
>
> Modified:
> cfe/trunk/lib/CodeGen/CGBuiltin.cpp
> cfe/trunk/test/CodeGen/avx-builtins.c
>
> Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=276417&r1=276416&r2=276417&view=diff
>
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Fri Jul 22 08:58:56 2016
> @@ -6619,6 +6619,26 @@ static Value *EmitX86MaskedLoad(CodeGenF
> return CGF.Builder.CreateMaskedLoad(Ops[0], Align, MaskVec, Ops[1]);
> }
>
> +static Value *EmitX86SubVectorBroadcast(CodeGenFunction &CGF,
> + SmallVectorImpl<Value *> &Ops,
> + llvm::Type *DstTy,
> + unsigned SrcSizeInBits,
> + unsigned Align) {
> + // Load the subvector.
> + Ops[0] = CGF.Builder.CreateAlignedLoad(Ops[0], Align);
> +
> + // Create broadcast mask.
> + unsigned NumDstElts = DstTy->getVectorNumElements();
> + unsigned NumSrcElts = SrcSizeInBits / DstTy->getScalarSizeInBits();
> +
> + SmallVector<uint32_t, 8> Mask;
> + for (unsigned i = 0; i != NumDstElts; i += NumSrcElts)
> + for (unsigned j = 0; j != NumSrcElts; ++j)
> + Mask.push_back(j);
> +
> + return CGF.Builder.CreateShuffleVector(Ops[0], Ops[0], Mask, "subvecbcst");
> +}
> +
> static Value *EmitX86Select(CodeGenFunction &CGF,
> Value *Mask, Value *Op0, Value *Op1) {
>
> @@ -6995,6 +7015,13 @@ Value *CodeGenFunction::EmitX86BuiltinEx
> getContext().getTypeAlignInChars(E->getArg(1)->getType()).getQuantity();
> return EmitX86MaskedLoad(*this, Ops, Align);
> }
> +
> + case X86::BI__builtin_ia32_vbroadcastf128_pd256:
> + case X86::BI__builtin_ia32_vbroadcastf128_ps256: {
> + llvm::Type *DstTy = ConvertType(E->getType());
> + return EmitX86SubVectorBroadcast(*this, Ops, DstTy, 128, 16);
>
Somewhat to my surprise, after a bunch of debugging, we found a bug in this
line: the hard-coded alignment of 16 is wrong.
See my fix in r278202. I wanted to mention it here in case others bisect
back to this and wonder, and because, frankly, I would never have thought of
this. The broadcast instructions, even when taking a 128-bit memory operand,
don't have an alignment requirement, so the load must not assume one. Color
me surprised.
Anyway, just FYI, and in case you want to double-check my fix.
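
For anyone bisecting back to this and trying to follow the helper, here is a minimal standalone sketch of the mask loop in the quoted EmitX86SubVectorBroadcast. It is an illustration only: the function name buildSubvectorBroadcastMask and its explicit scalarSizeInBits parameter are hypothetical, it uses std::vector instead of the LLVM types, and it is not part of clang.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical standalone model of the mask loop in the quoted
// EmitX86SubVectorBroadcast; the name and signature are assumptions
// made for illustration and do not exist in clang.
std::vector<uint32_t> buildSubvectorBroadcastMask(unsigned numDstElts,
                                                  unsigned srcSizeInBits,
                                                  unsigned scalarSizeInBits) {
  std::vector<uint32_t> mask;
  if (scalarSizeInBits == 0)
    return mask;

  // Number of elements in the loaded subvector (e.g. 128 / 64 = 2 doubles).
  unsigned numSrcElts = srcSizeInBits / scalarSizeInBits;
  if (numSrcElts == 0)
    return mask;

  // Repeat the source indices 0..numSrcElts-1 until the destination is full,
  // so each 128-bit lane of the result selects the same source elements.
  mask.reserve(numDstElts);
  for (unsigned i = 0; i < numDstElts; i += numSrcElts)
    for (unsigned j = 0; j != numSrcElts; ++j)
      mask.push_back(j);
  return mask;
}
```

For the pd256 builtin (four 64-bit elements) this yields the mask 0,1,0,1, and for ps256 (eight 32-bit elements) it yields 0,1,2,3,0,1,2,3. The mask itself is independent of alignment; the issue noted above concerns only the alignment passed to the load that feeds the shuffle.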