[PATCH] ARMEB: Fix trunc store for vector types
Christian Pirker
cpirker at a-bix.com
Fri Jun 13 08:19:56 PDT 2014
Hi all,
The ARM backend transforms a trunc store to shuffle and store operations.
A vector (or multiple scalars) is packed in one (wide) place to be stored in less memory operations.
The shuffle operation is utilized to extract some values (narrowed elements) to mimic the trunc operation.
Currently, the shuffle operation assumes little endian byte order.
This patch is calculating the shuffle indices for the least significant data (trunc) based upon the "higher side" of vector element.
Please review.
Thanks,
Christian
http://reviews.llvm.org/D4135
Files:
lib/Target/ARM/ARMISelLowering.cpp
test/CodeGen/ARM/big-endian-neon-trunc-store.ll
Index: lib/Target/ARM/ARMISelLowering.cpp
===================================================================
--- lib/Target/ARM/ARMISelLowering.cpp
+++ lib/Target/ARM/ARMISelLowering.cpp
@@ -8464,7 +8464,8 @@
SDLoc DL(St);
SDValue WideVec = DAG.getNode(ISD::BITCAST, DL, WideVecVT, StVal);
SmallVector<int, 8> ShuffleVec(NumElems * SizeRatio, -1);
- for (unsigned i = 0; i < NumElems; ++i) ShuffleVec[i] = i * SizeRatio;
+ for (unsigned i = 0; i < NumElems; ++i)
+ ShuffleVec[i] = TLI.isBigEndian() ? (i+1) * SizeRatio - 1 : i * SizeRatio;
// Can't shuffle using an illegal type.
if (!TLI.isTypeLegal(WideVecVT)) return SDValue();
Index: test/CodeGen/ARM/big-endian-neon-trunc-store.ll
===================================================================
--- test/CodeGen/ARM/big-endian-neon-trunc-store.ll
+++ test/CodeGen/ARM/big-endian-neon-trunc-store.ll
@@ -0,0 +1,26 @@
+; RUN: llc < %s -mtriple armeb-eabi -mattr v7,neon -o - | FileCheck %s
+
+define void @vector_trunc_store_2i64_to_2i16( <2 x i64>* %loadaddr, <2 x i16>* %storeaddr ) {
+; CHECK-LABEL: vector_trunc_store_2i64_to_2i16:
+; CHECK: vmovn.i64 [[REG:d[0-9]+]]
+; CHECK: vrev32.16 [[REG]], [[REG]]
+; CHECK: vuzp.16 [[REG]], [[REG2:d[0-9]+]]
+; CHECK: vrev32.16 [[REG]], [[REG2]]
+ %1 = load <2 x i64>* %loadaddr
+ %2 = trunc <2 x i64> %1 to <2 x i16>
+ store <2 x i16> %2, <2 x i16>* %storeaddr
+ ret void
+}
+
+define void @vector_trunc_store_4i32_to_4i8( <4 x i32>* %loadaddr, <4 x i8>* %storeaddr ) {
+; CHECK-LABEL: vector_trunc_store_4i32_to_4i8:
+; CHECK: vmovn.i32 [[REG:d[0-9]+]]
+; CHECK: vrev16.8 [[REG]], [[REG]]
+; CHECK: vuzp.8 [[REG]], [[REG2:d[0-9]+]]
+; CHECK: vrev32.8 [[REG]], [[REG2]]
+ %1 = load <4 x i32>* %loadaddr
+ %2 = trunc <4 x i32> %1 to <4 x i8>
+ store <4 x i8> %2, <4 x i8>* %storeaddr
+ ret void
+}
+
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4135.10391.patch
Type: text/x-patch
Size: 1908 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140613/48181a40/attachment.bin>
More information about the llvm-commits
mailing list