[PATCH] [AArch64 NEON] Fix a bug when lowering BUILD_VECTOR.
Jiangning Liu
liujiangning1 at gmail.com
Tue Dec 24 00:06:27 PST 2013
The patch itself looks OK to me to upstream.
However, AArch64TargetLowering::isKnownShuffleVector still needs some
improvement. Consider the case below:
define <8 x i8> @combine_2v(<16 x i8> %x) #0 {
%vecext = extractelement <16 x i8> %x, i32 0
%vecinit = insertelement <8 x i8> undef, i8 %vecext, i32 0
%vecext1 = extractelement <16 x i8> %x, i32 1
%vecinit2 = insertelement <8 x i8> %vecinit, i8 %vecext1, i32 1
%vecext3 = extractelement <16 x i8> %x, i32 2
%vecinit4 = insertelement <8 x i8> %vecinit2, i8 %vecext3, i32 2
%vecext5 = extractelement <16 x i8> %x, i32 3
%vecinit6 = insertelement <8 x i8> %vecinit4, i8 %vecext5, i32 3
%vecext7 = extractelement <16 x i8> %x, i32 4
%vecinit8 = insertelement <8 x i8> %vecinit6, i8 %vecext7, i32 4
%vecext9 = extractelement <16 x i8> %x, i32 5
%vecinit10 = insertelement <8 x i8> %vecinit8, i8 %vecext9, i32 5
%vecext11 = extractelement <16 x i8> %x, i32 6
%vecinit12 = insertelement <8 x i8> %vecinit10, i8 %vecext11, i32 6
%vecext13 = extractelement <16 x i8> %x, i32 7
%vecinit14 = insertelement <8 x i8> %vecinit12, i8 %vecext13, i32 7
ret <8 x i8> %vecinit14
}
This pattern is quite common, and the code currently generated for it is not
optimal, so some follow-up work is needed.
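
To make the shape of this case concrete: lane i of the <8 x i8> result is
taken from lane i of %x, for i = 0 through 7, so the result is simply lanes 0
through 7 of the source. The following is only a sketch in standalone C++ of
how such a contiguous run could be recognized. The helper name
isContiguousSubvector and its parameters are hypothetical; it does not use any
LLVM API and is not the logic in isKnownShuffleVector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of LLVM. SrcLanes[i] is the lane of the
// single source vector that result lane i is extracted from.
// Returns true when the result is a contiguous run of source lanes
// beginning at lane Start, i.e. SrcLanes == {Start, Start + 1, ...}.
bool isContiguousSubvector(const std::vector<int> &SrcLanes,
                           std::size_t SrcNumElts, std::size_t Start) {
  assert(Start <= SrcNumElts && "start lane out of range");
  // The run must be non-empty and must fit inside the source vector.
  if (SrcLanes.empty() || SrcLanes.size() > SrcNumElts - Start)
    return false;
  for (std::size_t I = 0; I < SrcLanes.size(); ++I)
    if (SrcLanes[I] != static_cast<int>(Start + I))
      return false;
  return true;
}
```

For the function above, the lane list is {0, 1, 2, 3, 4, 5, 6, 7} with a
16-lane source and Start = 0, which this sketch accepts.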
Thanks,
-Jiangning
2013/12/20 Kevin Qin <kevinqindev at gmail.com>
> Hi,
>
> DAG.getVectorShuffle() doesn't always return a vector_shuffle node. If the
> mask is exactly the identity sequence for its operand (for example,
> operand_0 is v8i8 and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it returns that
> operand directly. So a check is added here. Please review.
>
> http://llvm-reviews.chandlerc.com/D2452
>
> Files:
> lib/Target/AArch64/AArch64ISelLowering.cpp
> test/CodeGen/AArch64/neon-copy.ll
>
> Index: lib/Target/AArch64/AArch64ISelLowering.cpp
> ===================================================================
> --- lib/Target/AArch64/AArch64ISelLowering.cpp
> +++ lib/Target/AArch64/AArch64ISelLowering.cpp
> @@ -3957,7 +3957,10 @@
> if (V1.getNode() && NumElts == V0NumElts &&
> V0NumElts == V1.getValueType().getVectorNumElements()) {
> SDValue Shuffle = DAG.getVectorShuffle(VT, DL, V0, V1, Mask);
> - Res = LowerVECTOR_SHUFFLE(Shuffle, DAG);
> + if(Shuffle.getOpcode() != ISD::VECTOR_SHUFFLE)
> + Res = Shuffle;
> + else
> + Res = LowerVECTOR_SHUFFLE(Shuffle, DAG);
> return true;
> } else
> return false;
> Index: test/CodeGen/AArch64/neon-copy.ll
> ===================================================================
> --- test/CodeGen/AArch64/neon-copy.ll
> +++ test/CodeGen/AArch64/neon-copy.ll
> @@ -703,4 +703,26 @@
> %e = insertelement <4 x i32> %d, i32 %b, i32 2
> %f = insertelement <4 x i32> %e, i32 %b, i32 3
> ret <4 x i32> %f
> -}
> \ No newline at end of file
> +}
> +
> +define <8 x i8> @getl(<16 x i8> %x) #0 {
> +; CHECK-LABEL: getl:
> +; CHECK: ret
> + %vecext = extractelement <16 x i8> %x, i32 0
> + %vecinit = insertelement <8 x i8> undef, i8 %vecext, i32 0
> + %vecext1 = extractelement <16 x i8> %x, i32 1
> + %vecinit2 = insertelement <8 x i8> %vecinit, i8 %vecext1, i32 1
> + %vecext3 = extractelement <16 x i8> %x, i32 2
> + %vecinit4 = insertelement <8 x i8> %vecinit2, i8 %vecext3, i32 2
> + %vecext5 = extractelement <16 x i8> %x, i32 3
> + %vecinit6 = insertelement <8 x i8> %vecinit4, i8 %vecext5, i32 3
> + %vecext7 = extractelement <16 x i8> %x, i32 4
> + %vecinit8 = insertelement <8 x i8> %vecinit6, i8 %vecext7, i32 4
> + %vecext9 = extractelement <16 x i8> %x, i32 5
> + %vecinit10 = insertelement <8 x i8> %vecinit8, i8 %vecext9, i32 5
> + %vecext11 = extractelement <16 x i8> %x, i32 6
> + %vecinit12 = insertelement <8 x i8> %vecinit10, i8 %vecext11, i32 6
> + %vecext13 = extractelement <16 x i8> %x, i32 7
> + %vecinit14 = insertelement <8 x i8> %vecinit12, i8 %vecext13, i32 7
> + ret <8 x i8> %vecinit14
> +}
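
The check added in the quoted patch can be modelled outside of LLVM. The
sketch below is standalone C++ under simplifying assumptions: the names
isIdentityMask, Lowering and chooseLowering are hypothetical, masks are plain
integer vectors, and undef mask lanes are ignored. It only mirrors the
decision the patch makes (use the returned value directly when no shuffle
node was created, otherwise lower the shuffle); it is not the code in
AArch64ISelLowering.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not LLVM code: true when Mask selects lanes
// 0..N-1 of the first operand in order, so the shuffle changes nothing.
bool isIdentityMask(const std::vector<int> &Mask) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] != static_cast<int>(I))
      return false;
  return true;
}

enum class Lowering { UseOperandDirectly, LowerAsShuffle };

// Models the two branches in the patch: an identity mask means there is
// no shuffle node to lower, so the operand itself is the result.
Lowering chooseLowering(const std::vector<int> &Mask) {
  return isIdentityMask(Mask) ? Lowering::UseOperandDirectly
                              : Lowering::LowerAsShuffle;
}
```

With the mask from the example in the message (0, 1, 2, 3, 4, 5, 6, 7), this
model takes the first branch; any reordered mask takes the second.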